Ordering of variables affects learning in AR model

@nikolasthuesen I ran your code locally but found that MCMC is not mixing with any ordering of z_prev and the noise sites (r_hat is pretty large). I would recommend moving those y_mis* sites to the prediction phase, not the inference phase. To my knowledge, given a model p(y | x, theta), we use Bayesian imputation to impute missing values of x: we define a prior for x_mis and run MCMC to learn p(x_mis, theta | x_nomis, y). If the missing values are observations, as in your case, then after getting p(theta | x, y) with MCMC, you can generate predictions from p(y_mis | theta, x). FYI, if I remove y_mis1 and y_mis2, MCMC converges.

The same applies to forecasting. We should not forecast and infer latent variables at the same time: there is no likelihood term attached to those forecast sites, so MCMC has nothing to learn from them. Basically, if you put a prior on x without an observation, MCMC will return the same prior back to you:

```python
import numpyro
import numpyro.distributions as dist

def model():
    # no observation is attached to x, so its posterior equals its prior
    x = numpyro.sample("x", dist.Normal(0, 1))
```

In your model, y_mis1 is simply

```python
y_mis1 = z_collection[ix_mis1, 0] + sigma * numpyro.sample('y_mis1_noise', dist.Normal(0, 1))
```

and MCMC will return the same prior Normal(0, 1) for you. Without a reparameterization like that, it is pretty tricky for MCMC to do both jobs at once:

  • learn sites whose distributions depend on other latent variables
  • infer variables that do not need to be inferred

The solution to the first issue is the reparameterization above. For the second issue, simply move that inference to the prediction step.