I decided to try out Pyro/Numpyro to test the latest normalizing flow algorithms on my astrophysics model. The model is simple, consisting of only 7 parameters, but the posterior is usually multimodal (in a predictable way).
Here’s an outline of the model
def model(t, F, Ferr): ln_DeltaF = numpyro.sample("ln_DeltaF", dist.Normal(4., 4.)) DeltaF = jnp.exp(ln_DeltaF) ln_Fbase = numpyro.sample("ln_Fbase", dist.Normal(2., 4.)) Fbase = jnp.exp(ln_Fbase) t0 = numpyro.sample("t0", dist.Normal(ca.utils.estimate_t0(event), 20.)) ln_tE = numpyro.sample("ln_tE", dist.Normal(3., 6.)) tE = jnp.exp(ln_tE) u0 = numpyro.sample("u0", dist.Normal(0., 1.)) piEE = numpyro.sample("piEE", dist.Normal(0., .5)) piEN = numpyro.sample("piEN", dist.Normal(0., .5)) # Trajectory zeta_e_t = jnp.interp(t, points, zeta_e) zeta_n_t = jnp.interp(t, points, zeta_n) zeta_e_t0 = jnp.interp(t0, points, zeta_e) zeta_n_t0 = jnp.interp(t0, points, zeta_n) zeta_e_dot_t0 = jnp.interp(t0, points, zeta_e_dot) zeta_n_dot_t0 = jnp.interp(t0, points, zeta_n_dot) delta_zeta_e = zeta_e_t - zeta_e_t0 - (t - t0) * zeta_e_dot_t0 delta_zeta_n = zeta_n_t - zeta_n_t0 - (t - t0) * zeta_n_dot_t0 u_per = u0 + piEN * delta_zeta_e - piEE * delta_zeta_n u_par = ( (t - t0) / tE + piEE * delta_zeta_e + piEN * delta_zeta_n ) u = jnp.sqrt(u_per**2 + u_par**2) # Magnification A_u = (u ** 2 + 2) / (u * jnp.sqrt(u ** 2 + 4)) A_u0 = (u0 ** 2 + 2) / (jnp.abs(u0) * jnp.sqrt(u0 ** 2 + 4)) A = (A_u - 1) / (A_u0 - 1) # Predicted flux F_pred = DeltaF*A + Fbase ln_c = numpyro.sample("ln_c", dist.Exponential(1/2.)) numpyro.sample("data_dist", dist.Normal(F_pred, jnp.exp(ln_c) * Ferr), obs=F)
When I sample the model with NUTS the sampler finds only one of the modes because the posterior is multimodal in the u_0 parameter. Here’s what the likelihood surface looks like when evaluated on a (u0, piEN) grid while keeping all the other parameters fixed:
Ideally, I’d like to be able to train a guide which finds both modes and pass the guide to HMC for use in NeuTra HMC.
Based on this example, I thought the
AutoBNAFNormal autoguide might be able to catch both modes, however, after testing various combinations of input parameters I find that it always gets stuck in one of the modes. Even when I fix all of the parameters and fit for the parameters in a 2D subspace of the full parameter space, it fails to find both modes except for a few very particular combinations of the learning rate and input parameters to
My questions are the following:
- Is there a better choice for a guide which would work well for this kind of problem?
- I’ve read a paper tackling a similar problem where they used a 20 block MAF and a mixture of Gaussians for the base distribution of the flow and they seem to be able to recover multiple modes. How can I implement this in Pyro? Can I change the base distribution for
AutoBNAFNormal? I’ve read the docs on constructing custom guides but I don’t really know where to start.