Bayesian CNN

Hi,

I am new to Pyro. To try it out, I want to train a CNN using Bayesian inference in Pyro.

Can anyone help me transform it to use pyro.random_module?
The network is the following:

import torch.nn as nn

class NetC(nn.Module):
    def __init__(self, nc, nclass):
        super(NetC, self).__init__()
        self.nclass = nclass
        self.convlayers = nn.Sequential(
            nn.Conv2d(nc, 32, kernel_size=2, stride=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(32, 64, kernel_size=2, stride=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(64, 64, kernel_size=2, stride=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(64, 128, kernel_size=2, stride=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Linear(128 * 6 * 6, 2048),  # FC
            nn.Dropout(),
            nn.ReLU(),
            nn.Linear(2048, 2048),
            nn.ReLU(),
            nn.Dropout(),
            nn.Linear(2048, self.nclass),  # classifier
        )

    def forward(self, x):
        x0 = self.convlayers(x)
        x0 = x0.view(x0.size(0), -1)  # flatten conv features
        return self.fc(x0)
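
What I have so far is just a guess at the model (the unit-normal priors, the site name "net", the input/class sizes, and the Categorical likelihood are placeholders I made up, not something I know to be right):

import torch
import pyro
import pyro.distributions as dist

net = NetC(nc=3, nclass=10)  # e.g. RGB inputs, 10 classes

def model(x, y):
    # unit-normal prior over every weight and bias in the network
    priors = {
        name: dist.Normal(torch.zeros_like(p), torch.ones_like(p)).to_event(p.dim())
        for name, p in net.named_parameters()
    }
    # lift the nn.Module into a distribution over modules
    lifted_module = pyro.random_module("net", net, priors)
    # sample a concrete network from the prior and score the observations
    sampled_net = lifted_module()
    logits = sampled_net(x)
    with pyro.plate("data", x.size(0)):
        pyro.sample("obs", dist.Categorical(logits=logits), obs=y)

For SVI this would be paired with a guide over the same sample sites, e.g. an autoguide such as pyro.infer.autoguide.AutoDiagonalNormal(model).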



Please be advised that using vanilla stochastic variational inference to learn a Bayesian neural network with this many parameters is exceedingly unlikely to work. For one thing, the variance of the gradients will be extremely large.

If you work a bit harder, you can do things like what’s done in, for example, this paper.

But even there, learning the model is going to be quite difficult.

What about this approach to deal with a complex model structure (like a CNN):

  • Start with standard NN optimization, minimizing the MSE with a standard PyTorch optimizer, to reach a reasonable local minimum.
  • Transfer the weights to a Bayesian network and finish learning with Pyro SVI steps (see the sketch below).

Has someone tried to implement this kind of process?
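
Concretely, the second step could look something like this rough sketch. The diagonal-normal variational family, the warm-started means, and the initial log-scale of -4 are my own choices, and it assumes a `model` that lifts the same `net` with pyro.random_module under the same name "net" so the sample sites match:

import torch
import pyro
import pyro.distributions as dist

# step 1 (not shown): train `net` with a standard PyTorch optimizer
# to a reasonable local minimum

def guide(x, y):
    # diagonal-normal variational posterior whose means are warm-started
    # at the pretrained weights, with small initial scales
    dists = {}
    for name, p in net.named_parameters():
        loc = pyro.param("guide_{}_loc".format(name), p.detach().clone())
        log_scale = pyro.param("guide_{}_log_scale".format(name),
                               torch.full_like(p, -4.0))
        dists[name] = dist.Normal(loc, log_scale.exp()).to_event(p.dim())
    # lift the same module under the same name so the sample sites
    # match those in the model
    lifted_module = pyro.random_module("net", net, dists)
    return lifted_module()

# step 2: fine-tune with SVI
svi = pyro.infer.SVI(model, guide,
                     pyro.optim.Adam({"lr": 1e-4}),
                     loss=pyro.infer.Trace_ELBO())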

That is unlikely to work for a number of reasons. One is that the KL divergence term will tend to over-regularize the weights, so once you switch to SVI you’ll quickly ‘destroy’ the MSE solution you found.
