Problems with predictive for MCMC+NN

snow · August 6, 2019, 4:31pm

Hello,

This is my code:

import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.nn.functional as F
import pyro
import pyro.distributions as dist
from pyro.infer.mcmc import MCMC, HMC, NUTS
from pyro.infer.mcmc.api import MCMC
import pyro.poutine as poutine

from pyro.infer.mcmc.util import predictive
from pyro.distributions.util import sum_rightmost

from torch.autograd import Variable
import matplotlib.pyplot as plt

pyro.set_rng_seed(42)

N = 50 # Size of the dataset
X_data = torch.rand(N,1) # Sampling of N uniformly distributed points
a, b = 10, 5
sigma = 5
Y_data = a * X_data + b + dist.Normal(loc=0, scale=sigma).sample([N,1]) # Computing Y_data with normal noise

class NNModel(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(NNModel, self).__init__()
        self.L1 = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        output = self.L1(x)
        return output.squeeze(-1)

def model(x):
    L1w_prior = dist.Normal(loc=torch.zeros_like(Net.L1.weight), scale=torch.ones_like(Net.L1.weight))
    L1b_prior = dist.Normal(loc=torch.zeros_like(Net.L1.bias), scale=torch.ones_like(Net.L1.bias))
    sigma = pyro.sample('sigma', dist.Uniform(0,1))

    priors = {'L1.weight': L1w_prior, 'L1.bias': L1b_prior, 'sigma': sigma}

    lifted_module = pyro.random_module("module", Net, priors)
    lifted_net = lifted_module()

    with pyro.plate("map", len(x)):
        prediction = lifted_net(x)
        return pyro.sample("obs", dist.Normal(prediction, sigma))

def conditioned_model(model, x, y):
    return poutine.condition(model, data={"obs": y})(x)

Net = NNModel(1,1)

nuts_kernel = NUTS(conditioned_model)
mcmc = MCMC(nuts_kernel, num_samples=60, warmup_steps=0, num_chains=1)

mcmc.run(model, X_data, Y_data)
mcmc.summary()

from pyro.infer.mcmc.util import predictive
samples = mcmc.get_samples()

trace = predictive(conditioned_model, samples, model, X_data, Y_data, return_trace=True)

I am getting this error:

RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D

when I run this line:

trace = predictive(conditioned_model, samples, model, X_data, Y_data, return_trace=True)

I tried to fix the error by reshaping the weight and bias samples (with view), but then I get a different error.

Thank you for your attention.

neerajprad · August 6, 2019, 6:12pm

I would suggest avoiding pyro.module when you use MCMC: it doesn’t give you anything there, and it involves a few indirections that you could easily avoid. It also makes it hard to do the auto-batching that predictive relies on. Instead, you could just write your model as follows (if you need w and b to be matrices or vectors, you will just need to wrap the batch dims in a pyro.plate):

def model(x):
    w = pyro.sample('w', dist.Normal(loc=torch.tensor(0.),
                                     scale=torch.tensor(1.)))
    b = pyro.sample('b', dist.Normal(loc=torch.tensor(0.),
                                     scale=torch.tensor(1.)))
    sigma = pyro.sample('sigma', dist.Uniform(0, 1))

    with pyro.plate("map", len(x)):
        mu = x * w + b
        return pyro.sample("obs", dist.Normal(mu, sigma))

Once pyro-ppl/pyro issue #1995 ("Add option for sequential prediction to `mcmc.predictive`") is fixed, you should also be able to use your existing model by having predictive sequentially play the traces from the posterior. I should have a fix for this by end of day.

snow · August 6, 2019, 6:44pm

Thank you for your answer! But eventually I want to use MCMC with a more complex NN, such as an RNN, so from what I understand, your solution will not work?

neerajprad · August 6, 2019, 6:51pm

I am skeptical about MCMC’s performance on any reasonably sized NN, but if you are able to get it to work, the predictive utility will be the least of your problems. I should have a fix for this issue soon. I would suggest looking at SVI instead, and even then Bayesian NNs rarely work out of the box except for some simple cases like in the Bayesian regression tutorial.

snow · August 6, 2019, 7:02pm

> I am skeptical about MCMC’s performance on any reasonably sized NN, but if you are able to get it to work, the predictive utility will be the least of your problems. I should have a fix for this issue soon.

I also believe that the MCMC’s performance will be disastrous (I will run it on a supercomputer), but this is a research project and I have been asked to try the MCMC. Eventually, I will also have to try with the SVI.

> […] and even then Bayesian NNs rarely work out of the box except for some simple cases like in the Bayesian regression tutorial.

Do you have any advice other than being tenacious? I really like Pyro, but I think it has a steep learning curve.

neerajprad · August 6, 2019, 7:38pm

> Do you have any advice other than being tenacious? I really like Pyro, but I think it has a steep learning curve.

I just meant that it is an inherently hard topic and an area of active research, not that it is due to any limitation in Pyro itself. If it’s a small problem, HMC could work (and you can try out NumPyro’s HMC, which will be much faster). Here is a Bayesian NN example in NumPyro. If you are looking to do anything non-trivial, however, you are in uncharted territory.

snow · August 6, 2019, 8:12pm

Alright thanks, I will take a look at NumPyro!

I tried this code from part 1 of the Bayesian regression tutorial. I only added an MCMC and it really doesn’t work. Is that the bug you said you were going to fix?

I also have to import the MCMC from api:

from pyro.infer.mcmc.api import MCMC

This is the code:

import os
from functools import partial
import numpy as np
import pandas as pd
import seaborn as sns
import torch
import torch.nn as nn

import matplotlib.pyplot as plt

import pyro
from pyro.distributions import Normal, Uniform, Delta
from pyro.infer import SVI, Trace_ELBO
from pyro.optim import Adam
from pyro.distributions.util import logsumexp
from pyro.infer import EmpiricalMarginal, SVI, Trace_ELBO, TracePredictive
from pyro.infer.mcmc import MCMC, NUTS

from pyro.infer.mcmc.api import MCMC

import pyro.optim as optim
import pyro.poutine as poutine

# for CI testing
smoke_test = ('CI' in os.environ)
assert pyro.__version__.startswith('0.3.4')
pyro.enable_validation(True)
pyro.set_rng_seed(1)
pyro.enable_validation(True)

DATA_URL = "https://d2fefpcigoriu7.cloudfront.net/datasets/rugged_data.csv"
data = pd.read_csv(DATA_URL, encoding="ISO-8859-1")
df = data[["cont_africa", "rugged", "rgdppc_2000"]]
df = df[np.isfinite(df.rgdppc_2000)]
df["rgdppc_2000"] = np.log(df["rgdppc_2000"])

data = torch.tensor(df.values, dtype=torch.float)
x_data, y_data = data[:, :-1], data[:, -1]

class RegressionModel(nn.Module):
    def __init__(self, p):
        # p = number of features
        super(RegressionModel, self).__init__()
        self.linear = nn.Linear(p, 1)
        self.factor = nn.Parameter(torch.tensor(1.))

    def forward(self, x):
        return self.linear(x) + (self.factor * x[:, 0] * x[:, 1]).unsqueeze(1)

p = 2  # number of features
regression_model = RegressionModel(p)

def model(x_data, y_data):
    # weight and bias priors
    w_prior = Normal(torch.zeros(1, 2), torch.ones(1, 2)).to_event(1)
    b_prior = Normal(torch.tensor([[8.]]), torch.tensor([[1000.]])).to_event(1)
    f_prior = Normal(0., 1.)
    priors = {'linear.weight': w_prior, 'linear.bias': b_prior, 'factor': f_prior}
    scale = pyro.sample("sigma", Uniform(0., 10.))
    # lift module parameters to random variables sampled from the priors
    lifted_module = pyro.random_module("module", regression_model, priors)
    # sample a nn (which also samples w and b)
    lifted_reg_model = lifted_module()
    with pyro.plate("map", len(x_data)):
        # run the nn forward on data
        prediction_mean = lifted_reg_model(x_data).squeeze(-1)
        # condition on the observed data
        pyro.sample("obs",
                    Normal(prediction_mean, scale),
                    obs=y_data)
        return prediction_mean
    
nuts_kernel = NUTS(model)
mcmc = MCMC(nuts_kernel, num_samples=100, warmup_steps=50, num_chains=1)
mcmc.run(x_data, y_data)

from pyro.infer.mcmc.util import predictive
samples = mcmc.get_samples()
trace = predictive(model, samples, x_data, y_data, return_trace=True)

It gives:

RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D

neerajprad · August 6, 2019, 8:37pm

Check out part 2 of the tutorial, which uses MCMC on the same dataset. This is the same issue as earlier, namely getting your batch dimensions to align with pyro.plate when using pyro.module. pyro.module calls pyro.sample internally, and if you are sampling anything other than PyTorch scalars you will need to account for the batch dims by using pyro.plate. This restriction might seem cumbersome, but it is needed to do vectorized predictions correctly. In many cases you can probably get fast enough predictions without this vectorization, and I will add an option to do just that using predictive.
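To make the shape problem concrete, here is a small PyTorch-only sketch. It assumes that the posterior draws of an nn.Linear weight come back stacked along a leading sample dimension, which would make them 3-D and is a plausible source of the t() error above. The random tensors are stand-ins for real posterior draws, and the names S, N, seq and batched are made up for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
S, N = 60, 50             # posterior draws and data points, as in the first post
x = torch.rand(N, 1)

# Stand-ins for stacked posterior draws of an nn.Linear(1, 1) layer.
w = torch.randn(S, 1, 1)  # (draws, out_features, in_features): 3-D
b = torch.randn(S, 1)     # (draws, out_features)

# Sequential prediction: one 2-D weight at a time, as nn.Linear expects.
seq = torch.stack([F.linear(x, w[i], b[i]).squeeze(-1) for i in range(S)])

# Vectorized prediction: handle the draw dimension explicitly.
batched = (torch.einsum("nd,sod->sno", x, w) + b[:, None, :]).squeeze(-1)
```

Both seq and batched hold one row of predictions per draw. The sequential loop is simpler and works with an unmodified module, while the vectorized form needs the model to be written with the extra leading dimension in mind.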

You can also just write your own sequential predictive function (not tested) which should work with pyro.module:

def predict_mcmc(model, model_samples, *args, **kwargs):
    # model_samples is a list of dicts, one per posterior draw,
    # each mapping site name -> sampled value
    preds = []
    for sample in model_samples:
        # rerun the model with its latent sites fixed to this draw
        conditioned = poutine.condition(model, data=sample)
        model_trace = poutine.trace(conditioned).get_trace(*args, **kwargs)
        preds.append(model_trace.nodes['obs']['value'])
    return torch.stack(preds)

posterior = mcmc.get_samples()
num_samples = len(next(iter(posterior.values())))
samples = [{k: v[i] for k, v in posterior.items()} for i in range(num_samples)]
preds = predict_mcmc(model, samples, x_data, None)