How to include observations from a Poisson distribution?

Hi,

I’m new to pyro, so I apologize if this is too trivial for a post, but I cannot find out what’s wrong with my code.

I wanted to replicate the results of Variational Inference: The Basics | by Hylke C. Donker | Towards Data Science with pyro, to get a feel for the models and functions before I use it to solve my real problem. Part of the solution uses a Poisson distribution, and this is where my problems started. I’ve created a small example (see below) to show how my program crashes.

The problem is that I can’t use the observed data for my Poisson distribution. It crashes with Shape mismatch inside plate(‘data_loop’) at site obsp dim -1, 10 vs 2. As the data/observations are 10x2 I thought it was just a matter of transposing the data, but that did nothing. What really confuses me is that the same code works if I sample from a multivariate normal distribution, which produces samples of the same dimension as my Poisson distribution. Since the code works with other distributions, I don’t really have any ideas of what might cause my issue.

Can anyone point me in the right direction?

import torch
import pyro
from pyro.distributions import MultivariateNormal, Poisson

N = 10
data = torch.ones(N, 2)

mu = torch.tensor([1.0, 2.0])
n_sample = pyro.sample("whatever1", MultivariateNormal(mu, torch.eye(2)))
p_sample = pyro.sample("whatever2", Poisson(mu))
assert n_sample.shape == p_sample.shape
print(n_sample, p_sample)

with pyro.plate("data_loop", N):
    print(pyro.sample("obsn", MultivariateNormal(mu, torch.eye(2)), obs=data)) # works
    #print(pyro.sample("obsp", Poisson(mu)), obs=data) # Crashes

looks like you either need an additional plate for the 2 dimension or you need to use to_event:
pyro.sample("obsp", Poisson(mu).to_event(1)), obs=data)

Thanks for your reply!

I’ve tried to play around with an additional plate, nested plates and the to_event function without any luck. I will continue to try different combinations, but the ones that make sense in my head all fail. (When I use the to_event(1) you suggested, I get a new size error that I can’t get around.) I’ve added a new version of the code below since, the example above might miss some details that I’m not aware of.

Here is a simplified version of what I want to do:

  • Each data point contains 64 pixels
  • The intensity of each pixel is an integer, modeled with a Poisson variable (the pixels are independent, so they potentially have 64 different intensities)
  • The prior of the intensities have a gamma distribution.
  • I have a set of (178) data points, each data point is a vector of size 64
    I can’t seem to get the multi-dimensional Poisson distribution to do what I want to do.
import pyro
import pyro.distributions as dist
import pyro.distributions.constraints as constraints
import torch
import logging
from sklearn import datasets

digits = datasets.load_digits()
X_train = digits.images[digits.target == 0]
X_train = torch.tensor(X_train)
print(X_train.shape)
nr_pixels = 64
X_train = X_train.reshape((-1, nr_pixels)) # 
nr_samples, nr_pixels = X_train.shape
print(nr_samples, nr_pixels)

def model(data):
    a = 16.0 * torch.ones(nr_pixels)
    b = 1.0

    theta = pyro.sample("theta", dist.Gamma(a, b))
    assert len(theta) == nr_pixels

    with pyro.plate("data_loop", nr_samples):
        pyro.sample("obs", dist.Poisson(theta).to_event(1), obs=data)

        
def guide(data):
    aq = pyro.param("aq", torch.ones(nr_pixels), constraint=constraints.positive)
    bq = pyro.param("bq", torch.tensor([1]), constraint=constraints.positive)
    pyro.sample("theta", dist.Gamma(aq, bq))

adam = pyro.optim.Adam({"lr": 0.01})
elbo = pyro.infer.Trace_ELBO()
svi = pyro.infer.SVI(model, guide, adam, elbo)
pyro.render_model(model, model_args=(X_train,), render_distributions=True)
losses = []
for step in range(100):
    loss = svi.step(X_train)
    losses.append(loss)
    logging.info(f"ELBO loss: {loss}")

edit:
some of the configurations I’ve tried:

    # Shape mismatch inside plate('data_loop') at site obs dim -1, 178 vs 64
    # with pyro.plate("data_loop", nr_samples):
    #     pyro.sample("obs", dist.Poisson(theta), obs=data)

    # Expected [], actual [64]
    # with pyro.plate("data_loop", nr_samples):
    #     pyro.sample("obs", dist.Poisson(theta).to_event(1), obs=data)

    # Expected [], actual [64]
    # for s in range(nr_samples):
    #     for p in range(nr_pixels):
    #         pyro.sample(f"obs_{s},{p}", dist.Poisson(theta[p]), obs=data[s])

    # Expected [], actual [64]
    # with pyro.plate("data_loop", nr_samples):
    #     with pyro.plate("inner_loop", nr_pixels):
    #         pyro.sample("obs", dist.Poisson(theta).to_event(1), obs=data)
    
    # Shape mismatch inside plate('data_loop') at site obs dim -1, 178 vs 64
    with pyro.plate("data_loop", nr_samples):
        with pyro.plate("inner_loop", nr_pixels):
            pyro.sample("obs", dist.Poisson(theta), obs=data)

Edit 2:
I’ve tried with a MultivariateNormal as well, and I can’t get that to work either, so something with my approach is definitely wrong, I just can’t figure out what. Sometimes I can render the model, but I’ve never been able to run a step.
Using the language here this is what I want to do:

  • Draw rates \theta from 64 different Gamma distributions (batch_shape = 64)
  • Use each rate to configure a Poisson distribution (64 different distributions)
  • Observe the 178 samples from the data (sample_shape = 178)

I’ve managed to get things running for for a single pixel, but still no luck in doing it for all 64 pixels simultaneously. For a single pixel, I only consider the data of that pixel (so data shape is now (178, )). This circumvents the issues I had with the 2-dimensional data.
With 1-dimensional data and the model/guide specified below things work OK. I need a few thousand iterations to get a decent approximate posterior, but maybe that’s expected.

If anyone has seen any examples of several parallel univariate distributions (like my Poisson and Gamma) please let me know.

def model(data):
    #prior
    a, b = torch.tensor(10.0), torch.tensor(1.0)
    theta = pyro.sample("theta", dist.Gamma(a, b))

    # likelihood
    with pyro.plate("data_plate", nr_samples):
        pyro.sample("obs", dist.Poisson(theta), obs=data)

def guide(data):
    aq = pyro.param("aq", torch.tensor(10.0))
    bq = pyro.param("bq", torch.tensor(1.0))
    pyro.sample('theta', dist.Gamma(aq, bq))

try this

def model(data):
    a = 16.0 * torch.ones(nr_pixels)
    b = 1.0

    theta = pyro.sample("theta", dist.Gamma(a, b).to_event(1))
    assert len(theta) == nr_pixels

    with pyro.plate("data_loop", nr_samples):
        pyro.sample("obs", dist.Poisson(theta).to_event(1), obs=data)


def guide(data):
    aq = pyro.param("aq", torch.ones(nr_pixels), constraint=constraints.positive)
    bq = pyro.param("bq", torch.tensor([1]), constraint=constraints.positive)
    pyro.sample("theta", dist.Gamma(aq, bq).to_event(1))


above i used to_event but you could also use plate appropriately instead

read this:
https://pyro.ai/examples/svi_part_ii.html
https://pyro.ai/examples/tensor_shapes.html
https://pyro.ai/examples/logistic-growth.html