Using Pyro's Inverse Autoregressive Flow in HMC inference

Hi @stefanwebb, I came across a few discussions on IAFs, but haven’t found any examples or tutorials on their usage, so I have a question about how to use them. In particular, in order to speed things up, I am trying to use IAFs for inference with Hamiltonian Monte Carlo on a stochastic volatility model (similar to your github link); however, I am not 100% sure I am applying them correctly. Below is a toy example. Do I understand correctly that this is how IAFs should be applied when defining the model?

without IAFs:

import torch
import pyro
import pyro.distributions as dist

def model(tseries):
    sigma = pyro.sample('sigma', dist.InverseGamma(2.5, 0.025))
    y = pyro.sample('y', dist.Normal(0., sigma), obs=tseries)
    return y

with IAFs:

import numpy as np
from pyro.nn import AutoRegressiveNN

def model(tseries):
    transform_sigma = dist.transforms.AffineAutoregressive(AutoRegressiveNN(1, [100]))
    base_dist_sigma = dist.InverseGamma(torch.Tensor(np.repeat(2.5, 1)), 
                                        torch.Tensor(np.repeat(0.025, 1)))
    flow_dist_sigma = dist.TransformedDistribution(base_dist_sigma, transform_sigma)

    sigma = pyro.sample("sigma", flow_dist_sigma)
    y = pyro.sample('y', dist.Normal(0., sigma), obs=tseries)
    return y

Will be very grateful for your assistance.
Will be happy to share the code if necessary.

Arthur

Hi @ameshkovskiy, sorry for the late reply! I’ll do my best to answer your questions.

I can see a few errors in your code:

  1. You need to call pyro.module("my_transform", transform_sigma) somewhere so that the parameters of the AffineAutoregressive transform are registered with the Pyro parameter store. This lets Pyro know to include them in the optimizer when learning an objective function (see the sketch after this list).

  2. AffineAutoregressive isn’t very useful when you have a single input. The AutoregressiveNN is in a sense ignored, and the transformed distribution will output sigma = scale * base_sample + mu for scalar (scale, mu).

  3. For this reason, depending on the initialization of AffineAutoregressive and/or what happens during optimization (once you’ve registered the module with Pyro), the sample sigma could become negative, which would then cause the program to fail! You may want to pass it through a transformation to make it positive again (softplus?).

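Here is a minimal sketch of how your toy example could look with both fixes applied (the module name "my_transform" is just a label, and softplus is only one possible choice for enforcing positivity):

import torch
import torch.nn.functional as F
import pyro
import pyro.distributions as dist
from pyro.nn import AutoRegressiveNN

def model(tseries):
    transform_sigma = dist.transforms.AffineAutoregressive(AutoRegressiveNN(1, [100]))
    # register the transform's parameters with the Pyro parameter store
    pyro.module("my_transform", transform_sigma)
    base_dist_sigma = dist.InverseGamma(torch.tensor([2.5]), torch.tensor([0.025]))
    flow_dist_sigma = dist.TransformedDistribution(base_dist_sigma, transform_sigma)
    sigma = pyro.sample("sigma", flow_dist_sigma)
    # the affine flow can push the sample negative; map it back to the positive reals
    sigma = F.softplus(sigma)
    y = pyro.sample("y", dist.Normal(0., sigma), obs=tseries)
    return y
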
This model looks like it is using empirical Bayes… is this intended?

Hi @stefanwebb, thanks a lot for your response, and sorry for not replying straight away – the notification ended up in Spam.

  1. Could you be more precise about the form in which pyro.module("my_transform", transform_sigma) should be used?

  2. – 3. I used this example only for illustration, to check whether the overall idea is correct. Sorry for not being precise. A more detailed and realistic specification of the model looks as follows:

     import numpy as np
     import torch
     import pyro
     import pyro.distributions as dist
     from pyro.nn import AutoRegressiveNN
    
     transform_phi = dist.transforms.AffineAutoregressive(AutoRegressiveNN(1, [100]))
     base_dist_phi = dist.Beta(torch.Tensor(np.repeat(20., 1)), 
                               torch.Tensor(np.repeat(1.5, 1)))
     flow_dist_phi = dist.TransformedDistribution(base_dist_phi, transform_phi)
    
     transform_sigma = dist.transforms.AffineAutoregressive(AutoRegressiveNN(1, [100]))
     base_dist_sigma = dist.InverseGamma(torch.Tensor(np.repeat(2.5, 1)),
                                         torch.Tensor(np.repeat(0.025, 1)))
     flow_dist_sigma = dist.TransformedDistribution(base_dist_sigma, transform_sigma)
    
     transform_mu = dist.transforms.AffineAutoregressive(AutoRegressiveNN(1, [100]))
     base_dist_mu = dist.Normal(torch.Tensor(np.repeat(0., 1)), 
                                torch.Tensor(np.repeat(10., 1)))
     flow_dist_mu = dist.TransformedDistribution(base_dist_mu, transform_mu)
    
     # define model
     def model_nn(returns):
         
         phi = pyro.sample("phi", flow_dist_phi)
         phi = 2*phi - 1
    
         sigma2 = pyro.sample("sigma2", flow_dist_sigma)
    
         mu = pyro.sample("mu", flow_dist_mu)
    
         h = torch.empty(len(returns))
    
         for t in pyro.poutine.markov(range(len(returns))):
             if t == 0:
                 h[t] = pyro.sample(f'h_{t}', dist.Normal(mu, sigma2**0.5).to_event(0))
             else:
                 h[t] = pyro.sample(f'h_{t}', dist.Normal(mu + phi * (h[t-1] - mu), sigma2**0.5).to_event(0))
    
         y = pyro.sample('y', dist.Normal(0., (h / 2.).exp()), obs=returns)
    
         return y
    

I don’t quite understand whether this is the proper way of using IAFs for inference.
I have just come across these notes by Bogdan Mazoure, where he uses SVI() to apply NFs; my understanding of that approach is sketched below.
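
For reference, here is a minimal sketch of what I understand the SVI-based approach to be – the IAF lives in the guide rather than in the model, and the model itself keeps its plain priors (I may well be wrong about this):

import pyro
from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoIAFNormal

# the model keeps ordinary priors; the IAF is part of the variational guide
guide = AutoIAFNormal(model_nn)
svi = SVI(model_nn, guide, pyro.optim.Adam({"lr": 1e-3}), Trace_ELBO())
for step in range(1000):
    loss = svi.step(returns)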

I would be very grateful if you could advise on which way is the correct one. Do you happen to have any examples of applying IAFs to a similar task (i.e., inference for a stochastic volatility model)?

Any comments?

I have investigated this further. It looks like there is another option:
(1) train the guide using SVI inference,
(2) use Neural Transport reparameterization (NeuTraReparam), as described here.

However, when I try to do so step by step as suggested in the example, I get the following error:

AssertionError: NeuTraReparam does not support observe statements

Here is the definition of model() and of the main() function, which I call with tseries being an array of returns:

def model(returns):
    # init parameters
    phi = pyro.sample("phi", dist.Beta(20, 1.5))
    phi = 2 * phi - 1
    sigma2 = pyro.sample('sigma2', dist.InverseGamma(2.5, 0.025))
    mu = pyro.sample("mu", dist.Normal(0, 10))

    h = torch.empty(len(returns))

    for t in pyro.poutine.markov(range(len(returns))):
        if t == 0:
            h[t] = pyro.sample(f'h_{t}', dist.Normal(mu, sigma2**0.5).to_event(0))
        else:
            h[t] = pyro.sample(f'h_{t}', dist.Normal(mu + phi * (h[t-1] - mu), sigma2**0.5).to_event(0))

    y = pyro.sample('y', dist.Normal(0., (h / 2.).exp()), obs=returns)

def main(model, tseries):
    guide = AutoIAFNormal(model)
    optimizer = pyro.optim.SGD({"lr": 0.001})
    svi = SVI(model, guide, optimizer, Trace_ELBO())
    n_steps = 200
    for step in range(n_steps):
        loss = svi.step(torch.Tensor(tseries))
        if step % 100 == 0:
            print(loss)
    neutra = NeuTraReparam(guide)
    model = poutine.reparam(model, config=lambda _: neutra)
    hmc_kernel = HMC(model)
    hmc = MCMC(hmc_kernel,
               num_samples=100,
               warmup_steps=10,
               num_chains=1)

    hmc.run(torch.Tensor(tseries))

    return hmc

From debugging I found that the error is thrown during hmc.run() and that obs is not None for the variable 'y'. I would be very grateful if you could advise on what might cause this.
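
One workaround I am considering (untested – I am assuming here that the config callable of poutine.reparam receives the site dict and that returning None skips that site) is to exclude the observed site from the reparameterization:

# untested idea: apply NeuTraReparam only to the latent sites, skipping the observed 'y'
neutra = NeuTraReparam(guide)
model = poutine.reparam(model, config=lambda site: None if site["name"] == "y" else neutra)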

Sorry if you are not the right person to contact. I am also adding @fehiepsi, since I found you in the source code for NeuTraReparam. Sorry for bothering you.

Thanks!
Arthur

@ameshkovskiy This sounds like a bug to me. Would you mind creating an issue with reproducible code so I can take a look?

Thanks, created the issue.