Efficient way to generate conditional guide distribution

Hi all,

I’m trying to think of a way to efficiently implement a model/guide in numpyro that has two vectors sampled from a joint MVN, where the ith entries of each vector are correlated. Namely,

[\beta_1, \beta_2] ~ MVN_{2k}([0_k, 0_k], [[D_1, r D_1^{1/2} D_2^{1/2}], [r D_1^{1/2} D_2^{1/2}, D_2]]), where D_1 and D_2 are diagonal covariance matrices and r is a correlation parameter, so that the ith entries of \beta_1 and \beta_2 have correlation r.

I implemented this in the model by manually sampling \beta_1 from its marginal and \beta_2 from its conditional, using the following code:

    import jax.numpy as jnp
    import numpyro
    import numpyro.distributions as dist

    # so just implement the conditional manually for now
    rho = numpyro.param("rho", 0., constraint=dist.constraints.interval(-1., 1.))  # keep rho in (-1, 1)
    sd_1 = some_func_1()  # vector of std devs for beta_1 (sqrt of the diagonal of D_1)
    sd_2 = some_func_2()  # vector of std devs for beta_2 (sqrt of the diagonal of D_2)
    beta_1 = numpyro.sample("beta_1", dist.Normal(0., sd_1))
    # conditional: beta_2 | beta_1 ~ N(rho * sd_2 / sd_1 * beta_1, (1 - rho^2) * D_2)
    beta_2 = numpyro.sample("beta_2", dist.Normal(rho * sd_2 / sd_1 * beta_1,
                                                  jnp.sqrt(1 - rho**2) * sd_2))

What is less clear to me is how to ensure the posterior guide allows for correlation between the ith entries of beta_1 and beta_2 under this approach. In theory I could create k 2x2 MVNs, but those would not refer to the original beta_1 and beta_2 sites, which is required to map from the model to the guide. Any thoughts/help would be greatly appreciated.

Good question! Just to make sure that I understand correctly: you have two “vector-valued” variables beta_1, beta_2 in the model, and want to construct an MVN guide for them? One way to do this is:

    # "beta" is an auxiliary 2k-dimensional MVN site; MVN stands for a
    # MultivariateNormal built from variational parameters (see the sketch below)
    beta = numpyro.sample("beta", MVN, infer={"is_auxiliary": True})
    beta_1 = numpyro.sample("beta_1", dist.Delta(beta[:k]))
    beta_2 = numpyro.sample("beta_2", dist.Delta(beta[k:]))
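For concreteness, a minimal sketch of that guide might look like the following; beta_loc, beta_scale_tril, and k are just illustrative names, so adapt them to your setup:

    import jax.numpy as jnp
    import numpyro
    import numpyro.distributions as dist

    k = 100  # dimension of each of beta_1, beta_2

    def guide():
        # one joint 2k-dimensional MVN over the concatenation (beta_1, beta_2);
        # "is_auxiliary" tells numpyro this site has no counterpart in the model
        loc = numpyro.param("beta_loc", jnp.zeros(2 * k))
        scale_tril = numpyro.param("beta_scale_tril", jnp.eye(2 * k),
                                   constraint=dist.constraints.lower_cholesky)
        beta = numpyro.sample("beta",
                              dist.MultivariateNormal(loc, scale_tril=scale_tril),
                              infer={"is_auxiliary": True})
        # deterministically map the joint draw back to the model's site names
        numpyro.sample("beta_1", dist.Delta(beta[:k]))
        numpyro.sample("beta_2", dist.Delta(beta[k:]))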

If k is large, you might keep your current approach but use a multivariate normal for the conditional distribution of beta_2 in the guide. That is, use a diagonal normal for beta_1 and a (conditional) multivariate normal for beta_2.
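A rough sketch of that structure could be the following (parameter names are placeholders; the conditional mean uses an elementwise weight so the coupling stays linear in k, and for very large k you could keep the scale of beta_2 diagonal or low-rank instead of a full Cholesky factor):

    import jax.numpy as jnp
    import numpyro
    import numpyro.distributions as dist

    k = 100

    def guide():
        # diagonal (mean-field) normal for beta_1
        b1_loc = numpyro.param("b1_loc", jnp.zeros(k))
        b1_scale = numpyro.param("b1_scale", jnp.ones(k),
                                 constraint=dist.constraints.positive)
        beta_1 = numpyro.sample("beta_1", dist.Normal(b1_loc, b1_scale))

        # conditional MVN for beta_2 given beta_1: the mean is an affine function
        # of beta_1 (an elementwise weight w captures the per-index correlation)
        w = numpyro.param("b2_weight", jnp.zeros(k))
        b2_loc = numpyro.param("b2_loc", jnp.zeros(k))
        b2_scale_tril = numpyro.param("b2_scale_tril", jnp.eye(k),
                                      constraint=dist.constraints.lower_cholesky)
        numpyro.sample("beta_2",
                       dist.MultivariateNormal(b2_loc + w * beta_1,
                                               scale_tril=b2_scale_tril))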

Thanks @fehiepsi for the speedy response, as always.

Yes, in my case k is quite large and doing a full MVN sample and splitting won’t be possible (but this is a cool trick to know for future work).

How to manually implement the conditional posterior guide is still unclear to me. In the guide, is it possible to implement the conditional manually (as you suggest) but still grab the model parameter rho? Specifically,

    beta_1 = numpyro.sample("beta_1", dist.Normal(beta_1_loc, beta_1_sd))
    beta_2 = numpyro.sample("beta_2", dist.Normal(rho * beta_2_sd / beta_1_sd * beta_1,
                                                  jnp.sqrt(1 - rho**2) * beta_2_sd))

But here rho should refer to the model/likelihood parameter, not a new variational parameter.

EDIT: In hindsight, rho here most likely *should* be a variational parameter, rather than being forced to match the prior correlation. Is this correct (have I been sitting on my own answer this entire time, i.e. just re-use the structure I used in the prior)?
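For concreteness, a version of that guide with everything (including rho) as variational parameters might look like the following; all names here are just placeholders:

    import jax.numpy as jnp
    import numpyro
    import numpyro.distributions as dist

    k = 100

    def guide():
        # variational correlation, constrained to (-1, 1)
        rho_q = numpyro.param("rho_q", 0.0,
                              constraint=dist.constraints.interval(-1.0, 1.0))
        b1_loc = numpyro.param("beta_1_loc", jnp.zeros(k))
        b1_sd = numpyro.param("beta_1_sd", jnp.ones(k),
                              constraint=dist.constraints.positive)
        b2_loc = numpyro.param("beta_2_loc", jnp.zeros(k))
        b2_sd = numpyro.param("beta_2_sd", jnp.ones(k),
                              constraint=dist.constraints.positive)

        beta_1 = numpyro.sample("beta_1", dist.Normal(b1_loc, b1_sd))
        # entrywise conditional normal, mirroring the structure used in the model
        cond_loc = b2_loc + rho_q * b2_sd / b1_sd * (beta_1 - b1_loc)
        numpyro.sample("beta_2",
                       dist.Normal(cond_loc, jnp.sqrt(1 - rho_q**2) * b2_sd))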

> grab the model parameter rho

If you want to use the same parameter, simply use numpyro.param with the same name in the guide.
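For example, something along these lines (a toy model/guide pair just to illustrate the sharing; the site z and the other names are placeholders):

    import numpyro
    import numpyro.distributions as dist

    def model():
        # "rho" is registered in the global parameter store under this name
        rho = numpyro.param("rho", 0.0,
                            constraint=dist.constraints.interval(-1.0, 1.0))
        numpyro.sample("z", dist.Normal(rho, 1.0))

    def guide():
        # calling numpyro.param with the same name returns the same underlying
        # parameter, so the model and guide share (and jointly optimize) rho
        rho = numpyro.param("rho", 0.0,
                            constraint=dist.constraints.interval(-1.0, 1.0))
        z_loc = numpyro.param("z_loc", 0.0)
        z_sd = numpyro.param("z_sd", 1.0, constraint=dist.constraints.positive)
        numpyro.sample("z", dist.Normal(rho + z_loc, z_sd))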

> rho here most likely *should* be a variational parameter, rather than being forced to match the prior correlation

It depends on your model; I guess either way should be fine. I have seen many MCMC models that learn the correlation (using an LKJ distribution), though I haven’t seen a “constant” correlation like this. The point of a variational approximation is to make a sophisticated/amortized guide for your model, so it also makes sense to use a diagonal normal in the model and an MVN in the guide.