Initialize each chain of MCMC separately

Hi!

I am running NUTS in a setting where the data increase over time. What I would like to do is initialize the new chains with the last sample from each of the previous chains. This works fine when I only have one chain, as I can extract the last sample and use init_strategy = numpyro.infer.util.init_to_value(values=lastsample) in my NUTS kernel.

This approach does not work for multiple chains, as util.init_to_value does not seem to support initializing different chains with different sets of parameters. Is there a way to get this to work that I am unaware of? Or should this be submitted as a feature request on the GitHub page?

I tried to show the issue in a Google Colab below. The final two cells summarize the issue and the unsatisfactory workaround of initializing all chains to the parameters of a single chain.

I think post_warmup_state is what you need. It will skip the warmup phase on the next run with the new data. If you need to trigger the warmup phase on the next run, I guess you can do

mcmc.post_warmup_state = mcmc.last_state._replace(i=jnp.array(0))
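For reference, here is a minimal sketch of that pattern (the model and the data arrays are made up for illustration; only the `post_warmup_state` lines come from the suggestion above):

```python
import jax.numpy as jnp
from jax import random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

def model(data):
    mu = numpyro.sample("mu", dist.Normal(0.0, 10.0))
    numpyro.sample("obs", dist.Normal(mu, 1.0), obs=data)

data_t0 = jnp.array([0.1, -0.2, 0.3])        # first batch of observations
data_t1 = jnp.array([0.1, -0.2, 0.3, 0.5])   # same data plus a new point

mcmc = MCMC(NUTS(model), num_warmup=500, num_samples=500, num_chains=2)
mcmc.run(random.PRNGKey(0), data_t0)

# Continue each chain from its own last state when new data arrive;
# because post_warmup_state is set, the next run() skips warmup.
mcmc.post_warmup_state = mcmc.last_state
mcmc.run(mcmc.post_warmup_state.rng_key, data_t1)
```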

Thanks! This looks like what I am looking for.

Hello,

I’m trying to do sequential MCMC too. The motivation is similar to this thread and this other thread where I have sequential data coming in and would like to update from the last state, i.e., doing this.

I would appreciate

  1. being able to initialize the values for my multiple chains separately (and potentially initialize the inverse mass matrix separately as well in the future).
  2. being able to rerun the warmup steps when new data are coming in.

However, so far I haven’t been able to get it to work.

As in this thread, I can’t use init_to_value since it only allows for one value. There are several problems with using `mcmc.last_state._replace(i=np.zeros(num_chains))`:

  1. It looks like if I only want to initialize the values but not the inverse mass matrix, I’d need to take care of that myself and change all the other details in `last_state`. I would really appreciate being able to use something as simple as `init_to_value`.
  2. I’m assuming that the warmup phase is determined by `i`? So when I call `mcmc.run`, it will run for `num_samples` steps and the first `num_warmup` of them will be warmup? I’m not sure if my understanding is correct. It also feels a bit awkward to code up, since I’d need to change `num_samples` to `num_warmup + num_samples` and, when I call `mcmc.get_samples`, take the last `num_samples` samples myself.
  3. I suspect that my understanding about 2) is incorrect because when I look at the inverse_mass_matrix in mcmc.last_state, somehow it is not updated. In particular, this is the experiment I ran:
    mcmc = MCMC(kernel, num_warmup=1, num_samples=20, num_chains=1)
    mcmc.warmup(rng_key, O=O)
    mcmc.post_warmup_state = mcmc.last_state._replace(i=np.array(0))
    mcmc.run(rng_key, O=O)
    print(mcmc.last_state)
    The inverse_mass_matrix is still identity.
    (The above is just an experiment I was running. Eventually the use case would be for the observed data to be changed over time, which would also affect the posterior distribution.)

In short, I would really appreciate if anyone could give me guidance on

  1. how to set up initial values for multiple chains separately, with warmup
  2. (for the future) how to change the inverse_mass_matrix properly with warmup

Thank you so much! I really deeply appreciate it.

Hi @janelai22, I think the functional API provides the flexibility that you need (a rough sketch is given after this list):

  • First, use the initialize_model utility with dynamic_args=True (because you want to change the data later) to get model_info
  • Then, call hmc with potential_fn_gen=model_info.potential_fn to get init_kernel and sample_kernel
  • vmap init_kernel over init_params and inverse_mass_matrix to get init states
  • vmap/pmap fori_collect(warmup, warmup + num_samples, partial(sample_kernel, model_args=model_args, model_kwargs=model_kwargs)) with the init_state from the previous step
  • get the last state, modify it and run fori_collect again
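A rough single-chain sketch of these steps, using a toy model (all names here are illustrative; for multiple chains you would vmap/pmap the init and collection steps as described above):

```python
from functools import partial

import jax.numpy as jnp
from jax import random
import numpyro
import numpyro.distributions as dist
from numpyro.infer.hmc import hmc
from numpyro.infer.util import initialize_model
from numpyro.util import fori_collect

def model(data):
    mu = numpyro.sample("mu", dist.Normal(0.0, 10.0))
    numpyro.sample("obs", dist.Normal(mu, 1.0), obs=data)

data = jnp.array([0.1, -0.2, 0.3])
num_warmup, num_samples = 500, 500

# dynamic_args=True makes potential_fn a generator over model args,
# so the same kernels can be reused when the data change later.
model_info = initialize_model(
    random.PRNGKey(0), model, model_args=(data,), dynamic_args=True
)
init_kernel, sample_kernel = hmc(
    potential_fn_gen=model_info.potential_fn, algo="NUTS"
)

# Initialize the chain (this is the step to vmap over per-chain
# init_params / inverse_mass_matrix when running multiple chains).
init_state = init_kernel(
    model_info.param_info,
    num_warmup=num_warmup,
    model_args=(data,),
    rng_key=random.PRNGKey(1),
)

# Warmup + sampling; model_args is bound into the kernel with partial.
samples, last_state = fori_collect(
    num_warmup,
    num_warmup + num_samples,
    partial(sample_kernel, model_args=(data,)),
    init_state,
    transform=lambda state: model_info.postprocess_fn(data)(state.z),
    return_last_val=True,
)

# When new data arrive: modify last_state as needed and call fori_collect
# again with partial(sample_kernel, model_args=(new_data,)).
```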

If you want to use the MCMC API, then we can expose init_state in the MCMC.run API and use it at those lines. In addition, to set up different initial values for different chains, you will need to provide init_params to MCMC.run and make a FR/PR so that init_params here is not overwritten when it is not None. After those changes, the code would be

mcmc = MCMC(...)
# (optionally) run those statements to update inverse mass matrix
# mcmc._compile(..., init_params=batched_unconstrained_init_values)
# init_state = mcmc.last_state
# init_state = update_inverse_mass_matrix(init_state)
mcmc.run(..., init_params=batched_init_values)
next_init_state = mcmc.last_state
mcmc.run(..., init_state=next_init_state)

Thank you so much!!! This is so helpful. The first option looks good, and thanks for giving me so many details. It looks like after I changed `dynamic_args=True`, I need to pass `model_kwargs` into `fori_collect` as you suggested. However, I’m not entirely sure how to pass that information in. I’m just testing with one chain. I now have something like this:
hmc_states = fori_collect(
    num_warmup,
    num_warmup + num_samples,
    sample_kernel,
    hmc_state,
    model_kwargs={'T_max': T_max, 'R0': R0, 'A': A, 'O': O, 'RT_max': RT_max},
)
And it gave me errors. Could you perhaps elaborate on how I should pass this information?

As for the second option, I have a quick clarification question: what do you mean by FR/PR? My previous attempt with init_params had no impact, i.e., the result looked the same regardless of whether I put `init_params` in `mcmc.run(…)`. I believe something has overwritten it.

Thank you so much again! I really appreciate it!

Hi @janelai22, I think you need partial(sample_kernel, model_args=model_args, model_kwargs=model_kwargs), as in my last comment. By FR/PR, I meant a feature request or pull request at the pyro-ppl/numpyro GitHub repository.
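Concretely, with the names from the snippet above (assuming those variables are already defined as in your notebook), that would look roughly like:

```python
from functools import partial

hmc_states = fori_collect(
    num_warmup,
    num_warmup + num_samples,
    # bind the model kwargs into the sampling kernel instead of passing
    # them to fori_collect directly
    partial(sample_kernel, model_kwargs={'T_max': T_max, 'R0': R0, 'A': A,
                                         'O': O, 'RT_max': RT_max}),
    hmc_state,
)
```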

Hi,

Thank you so much! Sorry that I was confused, but this is clear now. However, I got an error, as shown here in the last cell. It worked fine when I set dynamic_args=False. I’m not familiar with JAX. Would you mind telling me what might have caused the error?

Thanks so much! Sorry for the delay.

I guess you need this

call hmc with potential_fn_gen=model_info.potential_fn to get init_kernel and sample_kernel
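In other words (a sketch reusing `model_info` from the earlier toy example, which stands in for the actual model):

```python
# With dynamic_args=True, model_info.potential_fn is a generator of potential
# functions (one per set of model args/kwargs), so pass it as potential_fn_gen:
init_kernel, sample_kernel = hmc(potential_fn_gen=model_info.potential_fn, algo="NUTS")

# With dynamic_args=False it is a plain potential function instead:
# init_kernel, sample_kernel = hmc(potential_fn=model_info.potential_fn, algo="NUTS")
```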

It is working nicely!!! Thank you so much!

Hi @fehiepsi! I am using the MCMC API, and would like to initialize each chain of a NUTS sampler separately. I would then like each chain to undergo the warmup phase and sampling phase as normal; for the problem I am currently facing, it seems that to explore the full posterior space I need to initialize the chains with different initial values.

My understanding is that this is not currently possible, right? I can see that a suggestion was made about a year ago; was a PR created? If it is as simple as described:

If you want to use `MCMC` api, then we can expose `init_state` to `MCMC.run` api and use it at [those lines ](https://github.com/pyro-ppl/numpyro/blob/master/numpyro/infer/mcmc.py#L540) to `MCMC.run` . In addition, to setup different initial values for different chains, you will need to provide `init_params` to `MCMC.run` and make a FR/PR to not overwrite [init_params here ](https://github.com/pyro-ppl/numpyro/blob/master/numpyro/infer/hmc.py#L643) if it is not None.

I would be happy to give it a go (assuming the hyperlinks are still valid?), as I would really need it - cheers!

@dpiras note that if you need to initialize different chains differently in order to explore more of the posterior (because each chain does not explore all modes on its own), this kind of fix won’t actually address the underlying poor mixing, and so the result isn’t particularly likely to be a great approximation to the exact posterior. Of course, the approximate posterior samples you get may still be useful, and this may be the best you can do for a hard inference problem, but just note that you need each chain to explore all posterior modes to get high-fidelity posterior samples.

Thank you, @martinjankowiak! I agree with you. I am just unsure how to explore more of the posterior in this case with NUTS, then. Would something like the reparameterizations help? Or should I just let the chains warm up and sample for longer?

For example (sorry for the spam), assume I have a parameter \theta for which I want to impose a starting value. I add a reparameterization in order to improve the geometry of the problem and apply e.g. a TransformReparam, obtaining \theta_{\rm base}. How do I now initialize \theta_{\rm base} given a desired initial value for \theta?
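To make this concrete, here is a toy sketch of what I mean (names and numbers are just for illustration):

```python
import numpyro
import numpyro.distributions as dist
from numpyro.handlers import reparam
from numpyro.infer.reparam import TransformReparam

def model():
    # theta ~ Normal(2, 3), written as an affine transform of a standard
    # Normal so that TransformReparam can be applied.
    with reparam(config={"theta": TransformReparam()}):
        numpyro.sample(
            "theta",
            dist.TransformedDistribution(
                dist.Normal(0.0, 1.0),
                dist.transforms.AffineTransform(2.0, 3.0),
            ),
        )

# After the reparameterization, the sampled site is "theta_base"; what should
# I pass to init_to_value for "theta_base" if I want "theta" to start at,
# say, 5.0?
```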

Yes, generally I’d suggest you use more samples and a longer warmup, explore reparameterization and other strategies, and use 64-bit precision if you’re not already.
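For instance, one way to apply these suggestions (a sketch with placeholder numbers and a placeholder model):

```python
from jax import random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

numpyro.enable_x64()  # run inference in 64-bit precision

def model():  # placeholder; substitute your (reparameterized) model
    numpyro.sample("theta", dist.Normal(0.0, 1.0))

# Longer warmup and more samples than the defaults; tune these to the problem.
mcmc = MCMC(NUTS(model), num_warmup=2000, num_samples=4000, num_chains=4)
mcmc.run(random.PRNGKey(0))
```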
