# SVI and guide for Hierarchical model - Radon example

Hi,

I’m trying to learn how to write my own `guide()` in the case of hiearchical models. Guides for non-hierarchical models, e.g. the Bayes Regression tutorial makes sense to me, but I am having trouble translating that to a hierarchical setting. Information on how to write a guide for hiearchical models seem sparse, though I did look through the eight schools example.

To learn, I’m trying to reproduce the example shown here: https://twiecki.io/blog/2014/03/17/bayesian-glms-3/
which in essence is trying to predict the radon level in homes based on whether or not a house has a basement. Data originates from different counties and thus it makes sense to learn a hierarchical setting where intercept and slope can vary based on county. I.e.: y_i = α_j[i] + β_j[i]*x_i + ϵ_i, for county j, observation i.

Writing the model (prior) and using MCMC to fit, seems to work quite well and it clearly learns different alphas and betas for each county. I wrote my model as follows:

``````    def model(self, X: pd.DataFrame, y: pd.Series):

floor = torch.from_numpy(X['floor'].to_numpy(dtype=np.float32))
county_idx = X['county_code'].values
num_counties = len(X['county'].unique())

# global hyperprior
mu_a = pyro.sample('mu_a', dist.Normal(0., 1.))
sigma_a = pyro.sample('sigma_a', dist.HalfCauchy(scale=1.))
mu_b = pyro.sample('mu_b', dist.Normal(0., 1.))
sigma_b = pyro.sample('sigma_b', dist.HalfCauchy(scale=1.))

# conditional independence between each county
# alpha, beta will have shape num_counties
with pyro.plate('counties', num_counties):
alpha = pyro.sample('alpha', dist.Normal(mu_a, sigma_a))
beta = pyro.sample('beta', dist.Normal(mu_b, sigma_b))

mean = alpha[county_idx] + beta[county_idx] * floor
eps = pyro.sample('eps', dist.HalfCauchy(1.0))

# conditional independence between each observation
with pyro.plate('data', floor.shape[0]):
obs = pyro.sample('obs', dist.Normal(mean, eps), obs=log_radon)

return obs
``````

The problem arises when I try to use SVI and write my own guide (approximate posterior). I’m unsure if I wrote the guide correctly and was hoping someone could tell me what I’m doing wrong. It seems to converge solely towards an overall group disitribution. I suspect I need to specify more `pyro.param`’s since I only have `pyro.param`’s for the hyperprior distribution. But I’m unsure whether or not pyro does this implicitly?

My guide looks as follows:

``````    def guide(self, X: pd.DataFrame):

floor = torch.from_numpy(X['floor'].to_numpy(dtype=np.float32))
county_idx = X['county_code'].to_numpy()
num_counties = len(X['county'].unique())

# Learnable parameters for hyperpriors
# mean and sd for mean of alpha
mu_mu_a = pyro.param('mu_mu_a', torch.randn(1))
sigma_mu_a = pyro.param('sigma_mu_a', torch.ones(1), constraint=constraints.positive)
# scale for sd of alpha
sigma_sigma_a = pyro.param('sigma_sigma_a', torch.ones(1), constraint=constraints.positive)

# mean and sd for mean of beta
mu_mu_b = pyro.param('mu_mu_b', torch.randn(1))
sigma_mu_b = pyro.param('sigma_mu_b', torch.ones(1), constraint=constraints.positive)
# scale for sd of beta
sigma_sigma_b = pyro.param('sigma_sigma_b', torch.ones(1), constraint=constraints.positive)

# learnable for the noise
eps_scale = pyro.param('eps_scale', torch.ones(1), constraint=constraints.positive)

# sample
mu_a = pyro.sample('mu_a', dist.Normal(mu_mu_a, sigma_mu_a))
sigma_a = pyro.sample('sigma_a', dist.HalfCauchy(sigma_sigma_a))

mu_b = pyro.sample('mu_b', dist.Normal(mu_mu_b, sigma_mu_b))
sigma_b = pyro.sample('sigma_b', dist.HalfCauchy(sigma_sigma_b))

with pyro.plate('counties', num_counties):
alpha = pyro.sample('alpha', dist.Normal(mu_a, sigma_a))
beta = pyro.sample('beta', dist.Normal(mu_b, sigma_b))

mean = alpha[county_idx] + beta[county_idx] * floor

eps = pyro.sample('eps', dist.HalfCauchy(eps_scale))

return mean
``````

I will then fit using SVI:

``````optimizer = optim.Adam({"lr": 1e-2})
svi = SVI(model, guide, optimizer, loss=Trace_Elbo())
for epoch in range(epochs):
svi.step(X, y)
``````

Any feedback on this would be great.

2 Likes

Yes, I too am looking for a neat example for hierarchical coefficients aka random coefficient model (Ideally a few fixed and a few random, or atleast all random). It is pretty straightforward to specify in MCMC3 package. It’d be great if pyro has an example. Many industrial applications…IMHO…