i was reading the tutorial on BNNs implementation in pyro (Tutorial 1: Bayesian Neural Networks with Pyro — UvA DL Notebooks v1.2 documentation). In the first example, it is described how to use NUTS MCMC algorithm to sample from the posterior. Everything works ok but when I change the function and create bimodal function consisting of two sinusoids. Then my posterior predictive samples are not good.
Here are the generated points of my bimodal distribution:
And here are the points from posterior predictive distribution. It looks that for each x the Y will be Gaussian.
Model is the following:
import pyro import pyro.distributions as dist from pyro.nn import PyroModule, PyroSample import torch.nn as nn class MyFirstBNN(PyroModule): def __init__(self, in_dim=1, out_dim=1, hid_dim=5, prior_scale=10.): super().__init__() self.activation = nn.Tanh() # or nn.ReLU() self.layer1 = PyroModule[nn.Linear](in_dim, hid_dim) # Input to hidden layer self.layer2 = PyroModule[nn.Linear](hid_dim, out_dim) # Hidden to output layer # Set layer parameters as random variables self.layer1.weight = PyroSample(dist.Normal(0., prior_scale).expand([hid_dim, in_dim]).to_event(2)) self.layer1.bias = PyroSample(dist.Normal(0., prior_scale).expand([hid_dim]).to_event(1)) self.layer2.weight = PyroSample(dist.Normal(0., prior_scale).expand([out_dim, hid_dim]).to_event(2)) self.layer2.bias = PyroSample(dist.Normal(0., prior_scale).expand([out_dim]).to_event(1)) def forward(self, x, y=None): x = x.reshape(-1, 1) x = self.activation(self.layer1(x)) mu = self.layer2(x).squeeze() sigma = pyro.sample("sigma", dist.Gamma(.5, 1)) # Infer the response noise # Sampling model with pyro.plate("data", x.shape): obs = pyro.sample("obs", dist.Normal(mu, sigma * sigma), obs=y) return mu
I use NUTS to sample from posterior:
model = MyFirstBNN() nuts_kernel = NUTS(model) mcmc = MCMC(nuts_kernel, warmup_steps=100,num_samples=300) # Convert data to PyTorch tensors x_train = torch.from_numpy(x_obs).float().unsqueeze(1) y_train = torch.from_numpy(y_obs).float() # Run MCMC mcmc.run(x_train, y_train)
And I generate posterior predictive using the following code:
predictive = Predictive(model=model, posterior_samples=mcmc.get_samples(), return_sites=("obs", "_RETURN")) x_test = torch.linspace(0,1, 100) preds = predictive(x_test.unsqueeze(1))
Am I doing something wrong (too small number of samples, maybe there is some other NUTS settings) or it is just impossible to use BNN with Gaussian posterior predictive for such cases? I was reading paper Hands-on Bayesian Neural Networks – a Tutorial for Deep Learning Users
([2007.06823] Hands-on Bayesian Neural Networks -- a Tutorial for Deep Learning Users) which actually claims that it should be possible with MCMC BNN.
PS. If I add only one mode (one sinusoid instead of two), then the posterior predictive fit is good.
Thank you in advance!