How is the inverse mass matrix computed in HMC?

Is the inverse mass matrix in HMC same as the covariance matrix of the corresponding samples obtained from it? If not, how is inverse mass matrix computed?

no. it’s something that’s dynamically adjusted during warm-up and will not equal the covariance of the samples. see e.g. the relevant stan documentation or any discussion of nuts/hmc in the literature

1 Like

Thank you for the quick response @martinjankowiak. But, the documentation that you have linked still does not mention how exactly is the inverse mass matrix computed. Can you please provide the appropriate document for reference.

I had raised this question based on Michael Betancourt 's article which mentioned this:

It will be really great if you could please help me understand the mistake that I am making. Thanks a lot!

the mass matrix only uses warm-up samples: not all samples. also it’s usually regularized somehow as opposed to a raw covariance

see e.g. here or just generally google for strings like “mass matrix adaptation hmc”

So, based on the above conversation, I tried out the following code:

J = 8
y = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
sigma = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])
rng_key = random.PRNGKey(0)

def eight_schools(J, sigma, y=None):
    mu = numpyro.sample('mu', dist.Normal(2, 5))
    tau = numpyro.sample('tau', dist.HalfCauchy(5))
    with numpyro.plate('J', J):
        theta = numpyro.sample('theta', dist.Normal(mu, tau))
        numpyro.sample('obs', dist.Normal(theta, sigma), obs=y)

mcmc = MCMC(NUTS(eight_schools, step_size = step, inverse_mass_matrix = matrix, regularize_mass_matrix = False), num_warmup = 50, num_samples = 0)
mcmc.warmup(rng_key,J,sigma,y=y, collect_warmup=True)
warmup_samples = mcmc.get_samples()['theta']

mu_std = jnp.std(mcmc.get_samples()['mu'], axis=0)
tau_std = jnp.std(mcmc.get_samples()['tau'], axis=0)
theta_std = jnp.std(warmup_samples, axis=0)

I compared the square of the above obtained values(mu_std, tau_std, theta_std) from the inverse_square_matrix obtained using the mcmc.last_state. Despite keeping regularize_mass_matrix = False, I am still getting quite varied values. Can you please help me figure out what else am I missing out? Thanks a lot!

if you want to follow the adaptation routine in detail i suggest taking a look at the relevant code

1 Like