 # Posterior Predictive Distribution using MCMC

#1

Dear fellow pyromants,

i have a question regarding how to obtain the posterior predictive distribution using MCMC techniques.
Here is the little example code i am trying to run:

``````# generate some data
np.random.seed(1)
xs = to.tensor(np.linspace(-2,2,23)).float()
f = lambda z : z*.5 + 1
ys = f(xs) + to.tensor(np.random.normal(0, .5, size=len(xs))).float()

def y_model(x, y_obs=None):
''' very simple linreg model'''
# priors
w = pyro.sample('w', pyro.distributions.Normal(loc=0, scale=10))
b = pyro.sample('b', pyro.distributions.Normal(loc=0, scale=10))

f_x = w*x + b

y = pyro.sample('y', pyro.distributions.Normal(loc=f_x, scale=1), obs=y_obs)

return f_x
``````

I can now sample models from the prior by just running
`samples_from_prior = y_model(xs, None)`

For inference, as described in the title, i want to use MCMC techniques:

``````from pyro.infer.mcmc import MCMC, NUTS
posterior_trace = MCMC(kernel=NUTS(y_model), num_samples=1337, warmup_steps=100)
posterior_trace.run(xs, ys)
``````

The `posterior_trace` is now an `pyro.infer.mcmc.mcmc` object, on which i can call `posterior_trace.marginal(sites=..).empirical[..]` to get the samples for the respective random variables “w” and “b” (and also ‘y’). And this is about how far the tutorial goes.

But now i would like to sample models `y_model` given the parameter configurations in the `posterior_trace`. I am not sure how this is ment to be done.

In case of SVI and a given guide i usually did the following:
`model_predictions = []`
`for _ in range(100): # sample 100 models (i.e. posteriors given y_obs)`

``````## trace of the guide (samples all variables given x from the guide)
guide_trace = pyro.poutine.trace(guide).get_trace(xs, None)

# runs the model using samples from the guide for the latent variables
replay_result = pyro.poutine.replay(y_model, guide_trace)(xs, None)

# append to list
model_predictions.append(replay_result.detach().numpy())
``````

which would give me 100 sampled models from the (guide-approximated) posterior predictive distribution.
How can i achieve the same using the results obtained from MCMC ?

Thanks in advance !

PS:

#2

You can use `pyro.infer.abstract_infer.TracePredictive` to get a representation of the posterior predictive distribution from an MCMC run:

``````posterior = MCMC(NUTS(model), ...).run(xs, ys)
posterior_predictive = TracePredictive(model, posterior, num_samples=100)
posterior_sample = posterior_predictive(xs)  # a Trace object
``````

See examples of the use of `TracePredictive` in the baseball example.

#3

First of all, thank you for you help ! It is greatly appreciated !

I tried to run your proposed code but it did not work, i received the following error:

``````from pyro.infer import TracePredictive
from pyro.infer.mcmc import MCMC, NUTS

posterior = MCMC(kernel=NUTS(y_model), num_samples=1337, warmup_steps=500).run(xs, ys)

posterior_predictive = TracePredictive(y_model, posterior, num_samples=100)
posterior_sample = posterior_predictive(xs)  # a Trace object

Sample: 100%|█| 1837/1837 [00:33<00:00, 55.51it/s, step size=1.12e+00, acc. rate=0.892]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-40-ce761d5a05f9> in <module>
5
6 posterior_predictive = TracePredictive(y_model, posterior, num_samples=100)
----> 7 posterior_sample = posterior_predictive(xs)  # a Trace object

~\Anaconda3\lib\site-packages\pyro\infer\abstract_infer.py in __call__(self, *args, **kwargs)
203         # we get the index from ``idxs_by_chain`` instead of sampling from
204         # the marginal directly.
--> 205         random_idx = self._categorical.sample().item()
206         chain_idx, sample_idx = random_idx % self.num_chains, random_idx // self.num_chains
207         sample_idx = self._idx_by_chain[chain_idx][sample_idx]

AttributeError: 'NoneType' object has no attribute 'sample'
``````

I also tried to look into the Baseball example, but it is a little to comprehensive for me to understand yet. I am still struggling with wrapping my head around examples that go over a few lines.

I managed to “sample models” in hacky way through `pyro.poutine.condition`:

``````sampled_parameters = posterior.marginal(sites=['w', 'b']).empirical

model = pyro.poutine.condition(y_model,
data=dict((k, v._samples.reshape(-1, 1))
for k, v in sampled_parameters.items()))
``````

But for this approach i have to provide the sites of all the parameters manually.

#4

It seems like the posterior samples are not yet populated and hence you see this error. Try the following (`.run` would generate and store forward samples by resampling sample sites from the posterior).

``````posterior_predictive = TracePredictive(y_model, posterior, num_samples=100).run(xs, None)
posterior_sample = posterior_predictive(xs)
# or
posterior_samples = posterior_predictive.marginal().empirical['_RETURN']
``````

Let me know if that does not work.

#5

Thank you very much !

``````from pyro.infer import TracePredictive
from pyro.infer.mcmc import MCMC, NUTS
#
posterior = MCMC(kernel=NUTS(y_model), num_samples=1337, warmup_steps=500).run(xs, ys)

posterior_predictive = TracePredictive(y_model, posterior, num_samples=100).run(xs, None)

sampled_model_outputs = posterior_predictive.marginal().empirical['_RETURN']._samples
``````

Works like a charm !

But i am not sure how to properly make use of the `pyro.poutine.trace_struct.Trace` object returned by
`posterior_sample = posterior_predictive(xs)`

could you elaborate on this ?

#6

The trace data structure (which depending on the version of Pyro you are using would either be a networkx.DiGraph instance or simply a Python class that wraps over dicts of nodes and edges in your graphical model), can be used to access the value of any sample site (i.e. sample from the posterior) using `trace.nodes[site_name]`. Alternatively you could also access the marginal distribution of any site directly, as is done by `sampled_model_outputs` for the `_RETURN` site:

``````posterior_predictive.marginals([site_name]).empirical[site_name]
``````

So depending on what you need you can get away with just the `.marginal` method or access the execution traces directly and post-process them for your analysis.