# [Newbie] How to get the loss value between every sample of a MCMC

Hello, I am totally new to Pyro and I am not sure to understand how I can get the loss value between every sample of a MCMC.

Also, I’m not quite sure why the function model is called multiple times (often many more than the number of samples).

get the loss value between every sample of a MCMC

Hi, can you clarify what you mean by loss value? Do you mean the unnormalized probability of each sample?

why the function model is called multiple times

In Metropolis-Hastings-based MCMC algorithms, proposed samples are sometimes rejected, meaning that the unnormalized density might need to be evaluated multiple times to produce a single sample. See Chapter 3 of Bayesian Methods for Hackers for an intuitive introduction to MCMC.

1 Like

If you want to get potential energy of the model given a collection of samples, then I think the fastest way is to use predictive utility. For example, consider a logistic regression model

``````import torch
import pyro
import pyro.distributions as dist
from pyro.infer.mcmc.api import MCMC
from pyro.infer.mcmc.nuts import NUTS

dim = 3
data = torch.randn(2000, dim)
true_coefs = torch.arange(1., dim + 1.)
labels = dist.Bernoulli(logits=(true_coefs * data).sum(-1)).sample()

def model(data):
with pyro.plate('plate_beta', dim):
coefs = pyro.sample('beta', dist.Normal(0., 1.))
with pyro.plate('plate_y'):
y = pyro.sample('y', dist.Bernoulli(logits=coefs.matmul(data.t())), obs=labels)
return y

nuts_kernel = NUTS(model, jit_compile=True, ignore_jit_warnings=True)
mcmc = MCMC(nuts_kernel, num_samples=500, warmup_steps=100)
mcmc.run(data)
samples = mcmc.get_samples()
``````

then you can get log probabilities as follows

``````from pyro.infer.mcmc.util import predictive
from pyro.distributions.util import sum_rightmost

trace = predictive(model, samples, data, return_trace=True)
trace.compute_log_prob()
pe = 0.
for site in trace.nodes.values():
if site['type'] == 'sample':
pe = pe - sum_rightmost(site['log_prob'], -1)
print(pe)
``````

However, you have to write correct `plate` statements in addition to using `coefs.matmul(data.t()))` instead of `(coefs * data).sum(-1)`.

Otherwise, you can use the slower way: compute potential energy for each sample

``````pe = []
for i in range(500):
sample = {k: v[i] for k, v in samples.items()}
# transform sample to unconstrained space
for k, transform in mcmc.transforms.items():
sample[k] = transform(sample[k])
pe.append(mcmc.kernel.potential_fn(sample))
``````

Alternative, you can use numpyro if you want to record many information such as potential energy, num_steps, accept_prob, step_size, inverse_mass_matrix,…

2 Likes

Hi, can you clarify what you mean by loss value? Do you mean the unnormalized probability of each sample?

I may be wrong in my understanding of the MCMC, but I will try to explain my thinking. Let’s consider that my model is a neural network.

For each sample of the MCMC:

1. The neural network (NN) makes a forward pass to calculate the loss on the examples;
2. The current NN parameters and loss are then transferred to the MCMC;
3. The MCMC calculates new parameters for the NN;

So what I want to plot is the computed loss for each sample.

In Metropolis-Hastings-based MCMC algorithms, proposed samples are sometimes rejected, meaning that the unnormalized density might need to be evaluated multiple times to produce a single sample. See Chapter 3 of Bayesian Methods for Hackers for an intuitive introduction to MCMC.

I thought the rejected samples were counted as samples since they convey the information that the MCMC could be in a high probability area.

Hi, can you clarify what you mean by loss value? Do you mean the unnormalized probability of each sample?

I may be wrong in my understanding of the MCMC, but I will try to explain my thinking. Let’s consider that my model is a neural network.

For each sample of the MCMC:

1. The neural network (NN) makes a forward pass to calculate the loss on the examples;
2. The current NN parameters and loss are then transferred to the MCMC;
3. The MCMC calculates new parameters for the NN;

So what I want to plot is the computed loss for each sample. Does it make any sense?

In Metropolis-Hastings-based MCMC algorithms, proposed samples are sometimes rejected, meaning that the unnormalized density might need to be evaluated multiple times to produce a single sample. See Chapter 3 of Bayesian Methods for Hackers for an intuitive introduction to MCMC.

I thought the rejected samples were counted as samples since they convey the information that the MCMC could be in a high probability area.

If you want to get potential energy of the model given a collection of samples, then I think the fastest way is to use predictive.