Training a partially Bayesian NN

I took a resnet18, adapted it for MNIST data, and replaced its last fully connected layer with a PyroLinear:

import torch.nn as nn
import torchvision
import pyro.distributions as dist
from pyro.nn import PyroSample
from pyro.nn.module import to_pyro_module_

in_channels, n_classes = 1, 10  # MNIST: 1-channel images, 10 classes

resnet = torchvision.models.resnet18(pretrained=True)
# Adapt the stem to small 28x28 MNIST inputs
resnet.conv1 = nn.Conv2d(in_channels=in_channels, out_channels=resnet.conv1.out_channels,
                         kernel_size=3, stride=1, padding=0, bias=False)
resnet.maxpool = nn.Identity()

# Replace the classifier head and convert it (in place) to a PyroModule
fc_layer = nn.Sequential(nn.Linear(resnet.fc.in_features, n_classes))
to_pyro_module_(fc_layer)
resnet.fc = fc_layer

# Give each weight/bias of the head a N(0, 1) prior, turning it into a latent variable
for m in resnet.fc.modules():
    for name, value in list(m.named_parameters(recurse=False)):
        setattr(m, name, PyroSample(
            prior=dist.Normal(0, 1).expand(value.shape).to_event(value.dim())))
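
As a quick sanity check (not part of the snippet above), tracing a forward pass through just the new head should show its weight and bias as Pyro sample sites rather than ordinary parameters:

import torch
import pyro.poutine as poutine

# Trace one forward pass through the converted head and list its sample sites
tr = poutine.trace(resnet.fc).get_trace(torch.randn(2, resnet.fc[0].in_features))
print([name for name, site in tr.nodes.items() if site["type"] == "sample"])
# expected to be something like ['0.weight', '0.bias']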

So only the last fc layer is Bayesian; the remaining layers are not frozen (i.e., their parameters still have requires_grad=True). But when I train the whole thing with SVI (roughly as sketched below), even the non-Bayesian parameters get updated.
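
For reference, this is roughly the kind of SVI setup I mean; it is a simplified sketch, not my exact code, and the PartialBNN wrapper, the Categorical likelihood, the AutoNormal guide, the learning rate, and train_loader are all illustrative placeholders.

import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoNormal
from pyro.nn import PyroModule
from pyro.optim import Adam

class PartialBNN(PyroModule):
    """Wraps the network above and adds a categorical likelihood over the 10 classes."""
    def __init__(self, net):
        super().__init__()
        self.net = net

    def forward(self, x, y=None):
        logits = self.net(x)
        with pyro.plate("data", x.shape[0]):
            pyro.sample("obs", dist.Categorical(logits=logits), obs=y)
        return logits

model = PartialBNN(resnet)
guide = AutoNormal(model)  # variational posterior over the head's latent weights
svi = SVI(model, guide, Adam({"lr": 1e-3}), loss=Trace_ELBO())

for x, y in train_loader:  # a standard MNIST DataLoader (placeholder)
    loss = svi.step(x, y)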

Question: How do the PyTorch “param store” (the module's own parameters) and the Pyro param store interact here?
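
For what it's worth, this is roughly how I'm looking at the two sides after a few svi.step calls (just a diagnostic sketch):

import pyro

# Parameters PyTorch tracks on the module itself, and whether they require grad
for name, p in resnet.named_parameters():
    print(name, p.requires_grad)

# Parameters registered in Pyro's global param store (e.g. by the autoguide)
print(list(pyro.get_param_store().keys()))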