 # Multi-Class Regression on MNIST

• What tutorial are you running?
Bayesian Regression
• What version of Pyro are you using?
1.2.1

Dear Pyro-Team,
I am trying to perform softmax regression on MNIST using mini-batch SGD, with a mean-field variational density and multivariate Normal priors with diagonal covariance over the weight matrix and bias vector. The loss decreases, but the accuracy over a training batch is always zero.

The respective code:

```python
import numpy as np
import torch
import torch.nn as nn
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Predictive, Trace_ELBO
from pyro.infer.autoguide import AutoDiagonalNormal
from pyro.nn import PyroModule, PyroSample


class SoftmaxRegression(PyroModule):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = PyroModule[nn.Linear](in_features, out_features)

        # Multivariate Normal priors (diagonal covariance) for weight matrix and bias vector
        self.linear.weight = PyroSample(
            prior=dist.Normal(0., 1.).expand([out_features, in_features]).to_event(self.linear.weight.dim())
        )
        self.linear.bias = PyroSample(
            prior=dist.Normal(0., 10.).expand([out_features]).to_event(self.linear.bias.dim())
        )

    def forward(self, x, y=None):
        # forward defines the likelihood of the statistical model; the linear output f(x) gives the class logits
        mean = self.linear(x)
        # Categorical likelihood over an i.i.d. data set, i.e. one likelihood term per data point
        with pyro.plate('data', size=x.shape[0]):
            likelihood = pyro.sample('likelihood', dist.Categorical(logits=mean), obs=y)
        return mean


def train():
    # optimizer, data_generator, batches_per_epoch, train_loader and one_hot are defined elsewhere (not shown)
    pyro.clear_param_store()

    num_epochs = 10

    model = SoftmaxRegression(28*28, 10)
    variational_density = AutoDiagonalNormal(model=model)

    svi = SVI(model=model, guide=variational_density, optim=optimizer, loss=Trace_ELBO())

    for itr in range(batches_per_epoch * num_epochs):
        x, y = next(data_generator)
        x = x.view(-1, 28*28)
        loss = svi.step(x, y)

        if itr % batches_per_epoch == 0:
            posterior_predictive = Predictive(model=model, guide=variational_density, num_samples=50,
                                              return_sites=('likelihood', '_RETURN'))
            predictive_samples = posterior_predictive(x)
            predictive_mean = torch.mean(predictive_samples['_RETURN'], dim=0)
            y = one_hot(np.array(y.numpy()), 10)
            target = np.argmax(y, axis=1)
            pred = np.argmax(predictive_mean, axis=1)
            acc = np.sum(pred == target) / 64.
            print('Training Batch Accuracy: {} | Loss: {}'.format(acc, loss / len(train_loader)))
```

This indicates that there is a bug in your evaluation code. I would suggest printing the intermediate results line by line to see which step goes wrong.

Dear fehiepsi,

I was suspecting I had made some kind of mistake in the model, but since you suggest the evaluation I will gladly take a closer look.

Regards!

I was so focused on Pyro that I didn’t realize I had forgotten to convert my predictions to NumPy. Naturally, the line

`acc = np.sum(pred_train == target_train) / 64.`

didn’t work.
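For later readers, here is a minimal sketch of that fix. It assumes the averaged `_RETURN` output is a torch tensor of shape `[batch, classes]` and that the labels are integer class indices; the tensors below are made-up stand-ins, not MNIST data.

```python
import numpy as np
import torch

# Made-up stand-ins for the averaged logits and the integer labels of one batch
predictive_mean = torch.tensor([[2.0, 0.1, 0.3],
                                [0.2, 1.5, 0.1],
                                [0.3, 0.2, 0.9],
                                [1.1, 0.4, 0.2]])
y = torch.tensor([0, 1, 2, 1])

# Convert the torch tensors to NumPy arrays before using NumPy operations on them
pred = np.argmax(predictive_mean.detach().cpu().numpy(), axis=1)
target = y.numpy()
acc = np.mean(pred == target)

# Equivalent computation that stays entirely in torch
acc_torch = (predictive_mean.argmax(dim=1) == y).float().mean().item()
```

Taking the mean of the comparison also avoids hard-coding the batch size (the `64.` in the original line), which would be off for a smaller final batch.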

Regards!

Dear Pyro-Forum,

I have a follow-up question regarding the same model. I would like to compute the negative log-likelihood under my posterior predictive, i.e. the negative log-probability of the observed data under the posterior predictive density.

I understand this can be done similarly to this post or this example. However, this involves annotating the prior definitions in the model using plate statements. Unfortunately, I haven’t quite understood how this would work, especially in conjunction with `PyroSample`.

Currently, when executing:

```python
pred = Predictive(model=model, guide=variational_density, num_samples=10)
pred.get_vectorized_trace(data)
```

I am getting the following error:

```
.local/lib/python3.6/site-packages/torch/nn/functional.py", line 1370, in linear
RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D
   Trace Shapes:
    Param Sites:
   Sample Sites:
linear.weight dist 10  1 | 1 2
             value    10 | 1 2
  linear.bias dist 10  1 | 1
             value    10 | 1
```

I have tried to change the prior definition to:

```python
with pyro.plate('x_axis', size=in_features):
    with pyro.plate('y_axis', size=out_features):
        self.linear.weight = PyroSample(
            prior=dist.Normal(weight_loc, weight_scale)
        )

with pyro.plate('bias', size=out_features):
    self.linear.bias = PyroSample(
        prior=dist.Normal(bias_loc, bias_scale)
    )
```

in order to get independent batch dimensions, i.e. assume all weights to be i.i.d. However, this leads to:

```
.local/lib/python3.6/site-packages/pyro/util.py", line 288, in check_site_shape
    '- .permute() data dimensions']))
ValueError: at site "linear.weight", invalid log_prob shape
  Expected [], actual [1, 2]
  Try one of the following fixes:
  - enclose the batched tensor in a with plate(...): context
  - .to_event(...) the distribution being sampled
  - .permute() data dimensions
```

Any help is much appreciated.

Regards!
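The `Expected [], actual [1, 2]` message above concerns the shape of `log_prob` at the weight site. As a rough illustration in plain `torch.distributions` (an assumption on my side: Pyro's `.to_event(n)` behaves like wrapping the distribution in `Independent` with `n` reinterpreted dimensions), the sketch below shows the two shapes for a weight matrix of size `[1, 2]`. It does not touch `PyroSample` or plates.

```python
import torch
from torch.distributions import Independent, Normal

# Sizes chosen to match the [1, 2] in the error message
out_features, in_features = 1, 2

base = Normal(0., 1.).expand([out_features, in_features])
w = base.sample()

# Without event dimensions every weight is its own batch element:
# log_prob has one entry per weight, shape [1, 2]
elementwise = base.log_prob(w)

# Declaring both dimensions as event dimensions sums them into one scalar:
# log_prob has shape []
joint = Independent(base, 2).log_prob(w)
```

One possible reading of the error, stated with some caution: a `PyroSample` prior is sampled when the attribute is accessed during `forward`, so plates opened in `__init__` are no longer active at that point and the batch dimensions stay undeclared.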

@ThinkPad I think the issue is that PyTorch's `nn.Linear` does not work with a batch of weights, so `pred.get_vectorized_trace` won’t work. A workaround is to run `pred(data)` to get `loc` and `scale` samples (you might use `pyro.deterministic` to declare that you want to record those values) and then compute the log-likelihood manually: `loglik = dist.Normal(loc, scale).log_prob(obs)`.
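For the softmax model in this thread, the analogous manual computation would use a categorical likelihood over the sampled logits rather than `Normal(loc, scale)`. The following is a minimal sketch in plain torch, assuming the `_RETURN` samples from `Predictive` have shape `[num_samples, batch, classes]` and the labels are integer class indices; the random tensors are stand-ins for real predictions, and the sizes are made up.

```python
import math
import torch
from torch.distributions import Categorical

torch.manual_seed(0)
S, N, C = 50, 8, 10              # posterior samples, data points, classes (made-up sizes)
logits = torch.randn(S, N, C)    # stand-in for predictive_samples['_RETURN']
y = torch.randint(0, C, (N,))    # stand-in for the integer labels

# log p(y_n | x_n, theta_s) for every posterior sample s and data point n, shape [S, N]
log_prob = Categorical(logits=logits).log_prob(y)

# Monte Carlo estimate of the log posterior predictive density per data point:
# log (1/S) * sum_s p(y_n | x_n, theta_s), computed stably with logsumexp
log_pred = torch.logsumexp(log_prob, dim=0) - math.log(S)

# Average negative log-likelihood per data point
nll = -log_pred.mean()
```

Averaging the probabilities over samples (via `logsumexp`) rather than averaging the log-probabilities is what makes this an estimate of the posterior predictive density instead of an expected log-likelihood.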

Dear @fehiepsi,

```python
posterior_predictive = Predictive(model=model, guide=variational_density, num_samples=num_mc_samples,
```

For the softmax regression model on MNIST I get a `-log_likelihood` of about 1.3, and for the same model on CIFAR10 a `-log_likelihood` of about 29.4. Thus, the negative log-likelihood of the data under this simple model increases drastically for more complicated data, which is what one would expect.