I’m still quite new to Pyro and have been trying to implement a variety of models to learn the basics, but I’ve run into an issue trying to implement a GARCH model. Looking at other implementations online, I found one for Stan here. I defined my model to stay as close to that implementation as possible and ended up with:
```python
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoDelta
from pyro.optim import Adam

def model(rtn_series):
    alpha_0 = pyro.sample("alpha_0", dist.HalfNormal(0.1))
    alpha_1 = pyro.sample("alpha_1", dist.Uniform(0, 1))
    beta_1 = pyro.sample("beta_1", dist.Uniform(0, 1))
    mu = pyro.sample("mu", dist.Normal(0, 0.1))
    sigma = torch.tensor(0.001) ** 2  # initial sigma
    for t in range(1, len(rtn_series)):
        sigma = torch.sqrt(alpha_0 + alpha_1 * (rtn_series[t - 1] - mu) ** 2
                           + beta_1 * sigma ** 2)
        pyro.sample(f"obs_{t}", dist.Normal(mu, sigma), obs=rtn_series[t])

adam_params = {"lr": 0.01}
optimizer = Adam(adam_params)

# set up the inference algorithm
guide = AutoDelta(model)
svi = SVI(model, guide, optimizer, loss=Trace_ELBO())
```
Now this gives similar results to Stan if I use MCMC, but when I try variational techniques it converges to a completely different result. Watching the parameter medians change, it’s clear that `beta_1` is constantly moving away from the value given by MCMC. I’ve tried it with `AutoNormal`, `AutoMultivariateNormal` and `AutoDelta`, and they all exhibit this behaviour. I’ve also tried normalising the data and removing the constraint on `beta_1` (i.e. by swapping `alpha_1` and `beta_1` around, which didn’t help, and then changing `dist.Uniform(0, 1 - alpha_1)` to `dist.Uniform(0, 1)`), but nothing seems to give anything close to the MCMC result.
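In case it helps with diagnosing this, one check I can run is fitting both methods on simulated GARCH(1,1) data with known parameters and seeing which one recovers them. A minimal simulator sketch (all parameter values below are made up for illustration):

```python
import numpy as np

def simulate_garch(n, alpha_0, alpha_1, beta_1, mu, seed=0):
    """Simulate n returns from a GARCH(1,1) model with Normal innovations."""
    rng = np.random.default_rng(seed)
    rtn = np.empty(n)
    # start from the unconditional variance (requires alpha_1 + beta_1 < 1)
    sigma2 = alpha_0 / (1.0 - alpha_1 - beta_1)
    rtn[0] = mu + np.sqrt(sigma2) * rng.standard_normal()
    for t in range(1, n):
        sigma2 = alpha_0 + alpha_1 * (rtn[t - 1] - mu) ** 2 + beta_1 * sigma2
        rtn[t] = mu + np.sqrt(sigma2) * rng.standard_normal()
    return rtn

# made-up "true" parameters to recover
rtn = simulate_garch(1000, alpha_0=1e-5, alpha_1=0.1, beta_1=0.85, mu=0.0)
```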
I’m wondering if there’s anything obvious I’m missing. I understand this might be an inefficient way to implement the model, but I’m not sure which direction is the best way to go.