Bayesian regression predictions

  • What tutorial are you running? Bayesian regression (Part I)
  • What version of Pyro are you using? 0.3.3

Hi, I’ve been following this tutorial to implement a Bayesian neural net in Pyro. I can follow it up to the prediction step, where I get a bit confused about the sampling pipeline. In particular, my questions are:

Considering the model and the evaluation code below:

def model(x_data, y_data):
    # ... I'm commenting the prior definition out
    lifted_module = pyro.random_module("module", regression_model, priors)
    lifted_reg_model = lifted_module()
    with pyro.plate("map", len(x_data)):
        prediction_mean = lifted_reg_model(x_data).squeeze(-1)
        pyro.sample("obs", Normal(prediction_mean, scale), obs=y_data)
        return prediction_mean

def evaluate_model(svi, x_data, y_data):
    posterior = svi.run(x_data, y_data)
    post_pred = TracePredictive(wrapped_model, posterior, num_samples=1000).run(x_data, None)
    marginal = EmpiricalMarginal(post_pred, ['obs'])._get_samples_and_weights()[0].detach().cpu().numpy()

  1. Inside svi.run(x_data, y_data), Pyro internally executes:
for i in range(self.num_samples):
    guide_trace = poutine.trace(self.guide).get_trace(*args, **kwargs)
    model_trace = poutine.trace(poutine.replay(self.model, trace=guide_trace)).get_trace(*args, **kwargs)
    yield model_trace, 1.0

I understand that poutine.trace(self.guide).get_trace(*args, **kwargs) is the way to sample from the trained guide (namely, the guide’s params), to be later used in the joint distribution (i.e. the model) via poutine.trace(poutine.replay(self.model, trace=guide_trace)).get_trace(*args, **kwargs).
a) Is what I’m saying correct?
b) In this last call, because I’m passing the x, y test data to model(x_data, y_data), isn’t that going to condition the likelihood on the y test data (due to pyro.sample("obs", Normal(prediction_mean, scale), obs=y_data))?
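To make the trace/replay pipeline concrete, here is a hedged toy sketch in plain Python (not the Pyro API; `toy_guide`, `toy_model`, and the latent name `"w"` are made up for illustration). The "guide trace" records a sampled latent, and the "replayed" model run reuses that exact value instead of drawing from the prior, which is what poutine.replay does:

```python
import random

# Toy illustration (not Pyro API): a "guide" samples latents, a "trace"
# records them, and "replaying" the model forces it to reuse them.
def toy_guide():
    # stand-in for the learned variational distribution over a weight "w"
    return {"w": random.gauss(0.0, 1.0)}

def toy_model(x, latents=None):
    # replayed run: reuse the guide's sample instead of drawing from the prior
    w = latents["w"] if latents is not None else random.gauss(0.0, 10.0)
    return [w * xi for xi in x]  # the model's mean prediction

random.seed(0)
guide_trace = toy_guide()                    # ~ poutine.trace(guide).get_trace(...)
preds = toy_model([1.0, 2.0], guide_trace)   # ~ poutine.replay(model, trace=guide_trace)
assert preds[1] == 2 * guide_trace["w"]      # the same latent drives every prediction
```

The point is that every prediction in one replayed run is driven by a single joint sample of the latents from the guide.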

  2. After calling svi.run(...) I should already have my posterior traces. Why do I need to call TracePredictive(...).run(x_data, None), which internally calls resampled_trace = poutine.trace(poutine.replay(self.model, model_trace)).get_trace(*args, **kwargs) again?

  3. Lastly, I see in the code that at several points a Delta distribution is instantiated with some values. Is this a trick so that you can recover these original values from a distribution object? That is, defining a distribution with only one value that can later be sampled from?

Thanks!


a) Is what I’m saying correct?

That’s right.

b) In this last call, because I’m passing the x, y test data to model(x_data, y_data), isn’t that going to condition the likelihood on the y test data (due to pyro.sample("obs", Normal(prediction_mean, scale), obs=y_data))?

Yes, and this is also the answer to your second question. svi.run will give you traces from the posterior, but it conditions observed sites on the initial data, so you need to resample the latent sites from these traces to generate predictions over new data.

  2. After calling svi.run(...) I should already have my posterior traces. Why do I need to call TracePredictive(...).run(x_data, None), which internally calls resampled_trace = poutine.trace(poutine.replay(self.model, model_trace)).get_trace(*args, **kwargs) again?

To get predictions, you still need to run the model forward (possibly on new data) and only resample the latent sites from the posterior distribution. TracePredictive is much more general (e.g. it deals with data subsampling), but you can always just use poutine.replay to generate predictions if you don’t need this general machinery.
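The "run forward with posterior latents" idea can be sketched without any Pyro machinery. In this hedged toy (all names and numbers are made up; the list of slopes stands in for latent samples collected from the fitted guide), each posterior latent sample drives one forward pass of the model on a new input, and the spread of the resulting draws is the posterior predictive:

```python
import random

# Toy posterior predictive: given latent samples (slopes "w" from a fitted
# approximation centered at 2.0), run the model forward on a new input and
# sample from the likelihood for each latent draw.
random.seed(2)
posterior_w = [random.gauss(2.0, 0.1) for _ in range(1000)]  # stand-in for SVI traces
x_new, scale = 3.0, 0.5

predictive = [random.gauss(w * x_new, scale) for w in posterior_w]
mean = sum(predictive) / len(predictive)
# the predictive mean should sit near E[w] * x_new = 6.0
```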

  3. Lastly, I see in the code that at several points a Delta distribution is instantiated with some values. Is this a trick so that you can recover these original values from a distribution object?

I’m not sure which part of the code you are referring to, but we commonly use pyro.sample(dist.Delta(x, log_prob=some_custom_term)) to inject some custom log density term into the trace, e.g. a Jacobian adjustment term.
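A hedged toy version of that idea (my own minimal class, not Pyro's Delta implementation): sampling a point mass always returns the stored value, and an extra log_prob term attached at construction is exactly what the distribution reports on its support, which is how a custom density term enters a trace's log joint:

```python
import math

# Toy point-mass distribution with an injectable log density term.
class ToyDelta:
    def __init__(self, v, log_prob=0.0):
        self.v = v            # the single supported value
        self.extra = log_prob # custom term, e.g. a Jacobian adjustment

    def sample(self):
        return self.v         # point mass: sampling always recovers v

    def log_prob(self, x):
        # the injected term on the support, -inf everywhere else
        return self.extra if x == self.v else -math.inf

d = ToyDelta(1.5, log_prob=0.7)
assert d.sample() == 1.5            # yes: the original value comes back out
assert d.log_prob(1.5) == 0.7       # the custom term lands in the log joint
assert d.log_prob(2.0) == -math.inf
```

So both readings are right: it wraps a known value so it behaves like a distribution at a sample site, and it carries an optional custom log density along with it.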


Thanks!!!

@neerajprad
In my opinion, Pyro currently puts a lot of effort into how to train a probabilistic model, but leaves evaluating the model quite complex. By complex, I mean we have to read how svi.run works, how traces work, what TracePosterior is, what TracePredictive is, and so on, just to understand how Pyro does prediction.

I am still quite confused about subsampling in TracePredictive.

Even the function name _get_samples_and_weights() suggests that, by design, it should not be used publicly, since it starts with an underscore.

Most of the documentation is not about how to predict in Pyro. Could we have a single doc covering prediction in a more general way, instead of only for the regression model?

I think the current approach exposes too many internals to the user. If I am missing something, please help me out. Thanks a lot!

In general, we want to fit a model, see the ELBO loss, predict values given new inputs, inspect the learned parameter values, and see the approximate posterior over the latent variables. But it seems these pieces are scattered across the sea of Pyro…


In my opinion, Pyro currently puts a lot of effort into how to train a probabilistic model, but leaves evaluating the model quite complex. By complex, I mean we have to read how svi.run works, how traces work, what TracePosterior is, what TracePredictive is, and so on, just to understand how Pyro does prediction.

I am in complete agreement with you. This is something we are refactoring right now. For the next release, the interface for doing predictions with MCMC will be simplified (and hopefully we can do the same for SVI shortly thereafter). You can follow this issue: https://github.com/pyro-ppl/pyro/issues/1930.


Thanks a lot!