In HMM with SVI, NumPyro doesn't find true params, Pyro does

mishavanbeek · May 19, 2022, 7:32pm

Sorry, I should have specified the discrepancy. It's the difference between recovering something like phi=0.88 (close to the true value of 0.9) and something like 0.6. Your explanation makes a lot of sense, many thanks! I understand that it is possible to calculate the conditional log_prob for a Gaussian state space model, but not in general.

I switched to MCMC and now I do recover the true value. I have also added the extra complexity, primarily a censored Student-t observation. Censoring works fine with a Normal distribution, but switching to a Student-t somehow pushes phi to zero and gives me a very noisy process, even when I choose df=1000 (so effectively a Normal distribution). Is the switch causing the algorithms to make different choices under the hood? Since the two distributions are numerically almost identical for df=1000, I would expect no difference.
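A quick way to sanity-check the "numerically almost identical" claim is to compare the two log densities directly. The following is a minimal sketch in plain Python, not NumPyro code; the helper names `normal_logpdf`, `student_t_logpdf`, and `max_logpdf_gap` are made up for illustration, and the grid of points within three standard deviations is an arbitrary choice.

```python
import math


def normal_logpdf(x, loc=0.0, scale=1.0):
    """Log density of Normal(loc, scale) at x."""
    z = (x - loc) / scale
    return -0.5 * math.log(2.0 * math.pi) - math.log(scale) - 0.5 * z * z


def student_t_logpdf(x, df, loc=0.0, scale=1.0):
    """Log density of a location-scale Student-t with df degrees of freedom at x."""
    z = (x - loc) / scale
    return (
        math.lgamma(0.5 * (df + 1.0))
        - math.lgamma(0.5 * df)
        - 0.5 * math.log(df * math.pi)
        - math.log(scale)
        - 0.5 * (df + 1.0) * math.log1p(z * z / df)
    )


def max_logpdf_gap(df, grid=None):
    """Largest absolute log-density difference between Student-t(df) and Normal on a grid."""
    if grid is None:
        # Illustrative grid: -3.0 to 3.0 in steps of 0.1 (standardized units).
        grid = [i / 10.0 for i in range(-30, 31)]
    return max(abs(student_t_logpdf(x, df) - normal_logpdf(x)) for x in grid)


# Compare a near-Normal Student-t with a heavy-tailed one.
print("df=1000:", max_logpdf_gap(1000.0))
print("df=2:   ", max_logpdf_gap(2.0))
```

This only compares log densities. It says nothing about how the cdf used in the censored branch is evaluated, so a small gap here would suggest looking at that branch rather than at the observed-data term.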

My code is below for reference.

import jax
import jax.numpy as jnp
import numpyro
import numpyro.distributions as dist

def model(y_obs, idx_y):
    
    N, L = y_obs.shape
    
    phi = numpyro.sample('phi', dist.Uniform(jnp.array(-1.0), jnp.array(1.0)))
    sigma_x = numpyro.sample('sigma_x', dist.HalfCauchy(jnp.array(1.0)))
    sigma_y = numpyro.sample('sigma_y', dist.HalfCauchy(jnp.array(1.0)))
    
    x_0 = numpyro.sample('x_0', dist.Normal(jnp.zeros(1), sigma_x))
    # T (the number of time steps) is not defined in this snippet; it is assumed to be set elsewhere.
    eps = numpyro.sample('eps', dist.Normal(jnp.zeros(T), sigma_x))
    
    def transition(x, e):
        x_new = x * phi + e
        return x_new, x_new
    
    _, x = jax.lax.scan(transition, x_0, eps)
    numpyro.deterministic('x', x)
    
    x_cmn = jnp.take_along_axis(x.T, idx_y, 1)
    
    y_lat = dist.Normal(x_cmn, sigma_y)
    #y_lat = dist.StudentT(df=2, loc=x_cmn, scale=sigma_y)
    
    with numpyro.handlers.mask(mask=y_obs > 0):
        numpyro.sample('obs', y_lat, obs=y_obs)
        
    with numpyro.handlers.mask(mask=y_obs == 0):
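        # Left-censored at 0: Bernoulli(P(y > 0)) observing 0 contributes log cdf(0).
        # nan must be passed by keyword; the second positional argument of jnp.nan_to_num is `copy`.
        numpyro.sample('trunc_label', dist.Bernoulli(1 - y_lat.cdf(jnp.nan_to_num(y_obs, nan=0.0))), obs=y_obs)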
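
For reference, here is what the two masked sample statements are meant to add up to, written out by hand for the Normal case. This is a sketch in plain Python under my own assumptions, not what NumPyro does internally: `censored_loglik` and the helpers are hypothetical names, observations equal to 0 are treated as left-censored at 0, and NaN entries are skipped because they fail both `y > 0` and `y == 0`.

```python
import math


def normal_logpdf(y, loc, scale):
    """Log density of Normal(loc, scale) at y."""
    z = (y - loc) / scale
    return -0.5 * math.log(2.0 * math.pi) - math.log(scale) - 0.5 * z * z


def normal_cdf(y, loc, scale):
    """CDF of Normal(loc, scale) at y, via the error function."""
    return 0.5 * (1.0 + math.erf((y - loc) / (scale * math.sqrt(2.0))))


def censored_loglik(y_obs, loc, scale):
    """Log-likelihood with left-censoring at 0.

    y_obs and loc are equal-length sequences; scale is a shared scalar.
    """
    total = 0.0
    for y, m in zip(y_obs, loc):
        if y > 0:
            # Uncensored observation: ordinary density term.
            total += normal_logpdf(y, m, scale)
        elif y == 0:
            # Censored observation: Bernoulli(p_pos) observing 0 gives log(1 - p_pos),
            # which is log P(latent <= 0) = log cdf(0).
            p_pos = 1.0 - normal_cdf(0.0, m, scale)
            total += math.log(1.0 - p_pos)
        # NaN (missing) entries match neither branch and contribute nothing.
    return total


# Example: one observed value, one censored value, one missing value.
print(censored_loglik([1.2, 0.0, float("nan")], [1.0, 0.3, 0.5], 0.8))
```

Swapping in a Student-t would mean replacing both the density and the cdf, so a difference in behaviour could come from either term.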