Shape mismatch while adapting Bayesian Regression tutorial

So I am using this tutorial as the basis for my code: https://pyro.ai/examples/bayesian_regression_ii.html#Comparing-Posterior-Distributions

In the tutorial, ruggedness, log_gdp and is_cont_africa are 1-dimensional with 170 samples, so each of these vectors has shape (170,). mean is of shape (170,) as well, while sigma is just a scalar.
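
For reference, the observation site in that tutorial looks roughly like this (mean of shape (170,), scalar sigma):

with pyro.plate("data", len(ruggedness)):
    pyro.sample("obs", dist.Normal(mean, sigma), obs=log_gdp)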

I am now adapting this tutorial to my data, which has an x of shape (s, features) and a y of shape (s, labels). Basically it's the same, just with one additional dimension. I have created a polynomial regression term that seems to work out (with n = labels, m = features):

A = pyro.sample("a", dist.Normal(torch.zeros(n, m), 10. * torch.ones(n, m)))
B = pyro.sample("b", dist.Normal(torch.zeros(n, m), 10. * torch.ones(n, m)))
C = pyro.sample("c", dist.Normal(torch.zeros(n), 10. * torch.ones(n)))
mean = (torch.mm(A, (x * x).T) + torch.mm(B, x.T) + C.view(-1, 1)).T
sigma = pyro.sample("sigma", dist.Uniform(torch.zeros(n), 10. * torch.ones(n)))

mean is of shape (s, labels) and sigma is of shape (labels,), analogous to the tutorial.
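
As a quick sanity check of those shapes, here is a toy-sized version (s = 5, m = 4 features, n = 3 labels) with placeholder tensors standing in for the sampled values:

import torch

s, m, n = 5, 4, 3                      # toy sample, feature and label counts
x = torch.randn(s, m)                  # (samples, features)
A = torch.zeros(n, m)                  # stands in for the sampled "a"
B = torch.zeros(n, m)                  # stands in for the sampled "b"
C = torch.zeros(n)                     # stands in for the sampled "c"
sigma = torch.zeros(n)                 # stands in for the sampled "sigma"

mean = (torch.mm(A, (x * x).T) + torch.mm(B, x.T) + C.view(-1, 1)).T
assert mean.shape == (s, n)            # (samples, labels)
assert sigma.shape == (n,)             # (labels,)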

However, this code causes problems:

with pyro.plate("data", x.shape[0]):
    pyro.sample("obs", dist.Normal(mean, sigma), obs=y)

In my case features = 180; labels = 177; s = 10540

ValueError: Shape mismatch inside plate('data') at site obs dim -1, 10540 vs 177
Trace Shapes:
Param Sites:
Sample Sites:
a dist 177 180 |
value 177 180 |
b dist 177 180 |
value 177 180 |
c dist 177 |
value 177 |
sigma dist 177 |
value 177 |
data dist |
value 10540 |

What am I getting wrong here? I have placed some asserts to make sure the shapes are identical to the tutorial except for the extra dimension. Thanks so much!

It seems that dist.Normal(mean, sigma) is the culprit. In the tutorial this is dist.Normal(mean of shape (170,), scalar sigma), while for me it is dist.Normal(mean of shape (s, labels), sigma of shape (labels,)). I guess it has to look different somehow?
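
To make that concrete, this is what the distribution looks like in my case (toy sizes, just for illustration):

import torch
import pyro.distributions as dist

s, n = 6, 3                        # toy sample and label counts
mean = torch.zeros(s, n)           # (samples, labels)
sigma = torch.ones(n)              # (labels,)

d = dist.Normal(mean, sigma)       # broadcasts to batch_shape (s, n)
print(d.batch_shape)               # torch.Size([6, 3])
print(d.event_shape)               # torch.Size([])
# pyro.plate("data", s) checks the rightmost batch dim (-1),
# which is n (labels) here, not s (samples); hence the 10540 vs 177 mismatch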

@jhmnk I think you can resolve the issue with some plate statements (see the tensor shapes tutorial: https://pyro.ai/examples/tensor_shapes.html). Let me know if you have any trouble making your model work.

Thanks, I had already seen that page and have now reread it to get a better picture.

So if my understanding is correct, I have two independent dimensions in my data:

  1. samples
  2. labels

As I understand it, I should therefore nest two plate statements like this:

with pyro.plate("samples", x.shape[0]):
    with pyro.plate("labels", y.shape[1]):
        d = dist.Normal(mean, sigma)
        pyro.sample("obs", d, obs=y)

However, this also fails with a similar error:

ValueError: Shape mismatch inside plate('labels') at site obs dim -2, 177 vs 10540
Trace Shapes:
Param Sites:
Sample Sites:
a dist 177 180 |
value 177 180 |
b dist 177 180 |
value 177 180 |
c dist 177 |
value 177 |
sigma dist 177 |
value 177 |
samples dist |
value 10540 |
labels dist |
value 177 |
Trace Shapes:
Param Sites:
Sample Sites:

I have a feeling my problem is with understanding event_shape in Pyro somehow. As I learn best by example, would it be possible either to tell me the correct plate statement or to point me to a similar example?

My data again:
features of shape (samples, features)
labels of shape (samples, labels)
the variable mean in the code is of shape (samples, labels), while sigma is of shape (labels,)

In theory, every mapping from the features to an individual label can be independent, and every sample is independent as well.
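
Based on my reading so far, my best guess (untested) is to wrap the per-label parameters in .to_event() so they do not need their own plates, and to pin the two plates around the observation to explicit dims, samples at -2 and labels at -1. Would something like this be the right direction?

import torch
import pyro
import pyro.distributions as dist

def model(x, y):
    # x: (samples, features), y: (samples, labels); y is assumed to be given here
    s, m = x.shape
    n = y.shape[1]
    # per-label parameters declared as single events, so they need no plate
    A = pyro.sample("a", dist.Normal(torch.zeros(n, m), 10. * torch.ones(n, m)).to_event(2))
    B = pyro.sample("b", dist.Normal(torch.zeros(n, m), 10. * torch.ones(n, m)).to_event(2))
    C = pyro.sample("c", dist.Normal(torch.zeros(n), 10. * torch.ones(n)).to_event(1))
    sigma = pyro.sample("sigma", dist.Uniform(torch.zeros(n), 10. * torch.ones(n)).to_event(1))
    mean = (torch.mm(A, (x * x).T) + torch.mm(B, x.T) + C.view(-1, 1)).T   # (samples, labels)
    # plates pinned to explicit dims: samples at -2, labels at -1
    with pyro.plate("samples", s, dim=-2):
        with pyro.plate("labels", n, dim=-1):
            pyro.sample("obs", dist.Normal(mean, sigma), obs=y)
    return mean

Alternatively, I suppose the label dimension could be folded into the event shape of the obs site with dist.Normal(mean, sigma).to_event(1) under a single data plate, but I am not sure which pattern is intended here.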

So this other tutorial has the same issue. Apparently it was written to support more than one label:

class BayesianRegression(PyroModule):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = PyroModule[nn.Linear](in_features, out_features)
        self.linear.weight = PyroSample(dist.Normal(0., 1.).expand([out_features, in_features]).to_event(2))
        self.linear.bias = PyroSample(dist.Normal(0., 10.).expand([out_features]).to_event(1))

    def forward(self, x, y=None):
        sigma = pyro.sample("sigma", dist.Uniform(0., 10.))
        mean = self.linear(x).squeeze(-1)
        with pyro.plate("data", x.shape[0]):
            obs = pyro.sample("obs", dist.Normal(mean, sigma), obs=y)
        return mean

It works fine if I use the data from the tutorial, where the label tensor has shape (samples,). However, as soon as the label tensor has shape (samples, labels), I receive a ValueError almost identical to the one I posted, so it's clearly not designed to handle multiple labels. Even a label tensor of shape (samples, 1) does not work.

edit: I also tried transposing the tensors in all possible combinations, without much effect.
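
For completeness, my best guess (again untested) at how that module would have to change for a label tensor of shape (samples, labels): keep the full output of the linear layer instead of squeeze(-1), give sigma one entry per label, and fold the label dimension into the event shape of the obs site.

import torch.nn as nn
import pyro
import pyro.distributions as dist
from pyro.nn import PyroModule, PyroSample

class BayesianRegression(PyroModule):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = PyroModule[nn.Linear](in_features, out_features)
        self.linear.weight = PyroSample(dist.Normal(0., 1.).expand([out_features, in_features]).to_event(2))
        self.linear.bias = PyroSample(dist.Normal(0., 10.).expand([out_features]).to_event(1))

    def forward(self, x, y=None):
        # one sigma per label, treated as a single event of dim 1
        sigma = pyro.sample("sigma", dist.Uniform(0., 10.).expand([self.linear.out_features]).to_event(1))
        mean = self.linear(x)          # (samples, labels); no squeeze(-1)
        with pyro.plate("data", x.shape[0]):
            # fold the label dimension into the event shape of each observation
            pyro.sample("obs", dist.Normal(mean, sigma).to_event(1), obs=y)
        return mean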

Hi There!

Did you figure out what happened here? I'm also modifying this tutorial with my own dataset, which has 35987 samples, 2304 features, and 7 labels, and I'm facing the exact same problem. It'd be really helpful if you could help me with this! Thank you!

Steve

Sorry, I didn't have time to pursue this fully, so I did not solve it, IIRC.

@jhmnk Sorry! I missed the follow-up. :frowning:

@stevekx Could you provide reproducible code, ideally with a toy dataset? I think I can help resolve the issue.