CAVI + ADVI mixed approach

I have a model that has a bunch of latent variables, say \theta_1, \dotsc, \theta_L, that have conjugate posteriors conditional on some other latent variable, say z. If I fix this latent variable z then CAVI works really well and scales fine, however there are no known conjugacy relations I can leverage when I want to infer the approximate posterior for q(z). I’m left with an unwieldy, but appropriately derived \mathbb{E}_{-z}[\log q(z)] up to the normalizing constant.

Is it possible to alternate between CAVI-style updates within numpyro for \theta_1, \dotsc, \theta_L and then use ADVI for z? I’d like to avoid doing a full ADVI approach due to many discrete latent variables in \theta_1, \dotsc, \theta_L.

My first thought was to compute the CAVI updates within the model function, and only define a guide for z, however the ELBO computation would be incorrect by failing to include terms related \theta_1, \dotsc, \theta_L. I’m curious to what the community’s thoughts are on this.

Other than this approach, alternatives would be to use something like Laplace or Delta approximation variational inference to handle the non-conjugacy, but that would fall outside of numpyro in terms of implementation.

1 Like