Add additional level in hierarchical model

mattiasthalen · June 20, 2021, 8:26am

Hi there, first post here
I’ve previously been using PyMC3, but wanted to give numpyro a go.

I’ve looked at this example for hierarchical modeling:
http://num.pyro.ai/en/stable/tutorials/bayesian_hierarchical_linear_regression.html

But what I want to do is to add another layer, could be seen as something like country in the above example. But how would that look in regards to the plates and the formula: alpha[patient_id, country_id] + beta[patient_id, country_id]*Weeks

fehiepsi · June 20, 2021, 2:23pm

Hi @mattiasthalen, welcome! I think if we have country level, the code would be something like

    with numpyro.plate("patients", n_patients, dim=-2):
        with numpyro.plate("countries", n_countries, dim=-1):
            α = numpyro.sample("α", dist.Normal(μ_α, σ_α))
            β = numpyro.sample("β", dist.Normal(μ_β, σ_β))

    FVC_est = α[PatientID, country_id] + β[PatientID, country_id] * Weeks

mattiasthalen · June 20, 2021, 4:16pm

Interesting!
Is there no need for intermediate parameters, is only the global & local parameters needed?

In PyMC3 I would have something like:
global_alpha_mu -> country_alpha_mu -> alpha

fehiepsi · June 22, 2021, 4:57am

The way we construct a model corresponds 1-1 to a graphical model. So if there is an “intermediate” variable in your graphical model, then you will need to define it in (num)pyro. If you want to record any deterministic value, you can use numpyro.deterministic. Could you post your simplified pymc3 model here so I can get better understanding of what you are looking for?

mattiasthalen · June 29, 2021, 8:11am

Thanks for the help!

I made it work in the end

I was after this:

alpha = numpyro.sample('alpha', alpha_prior)
beta0 = numpyro.sample('beta0', beta0_prior)
beta1 = numpyro.sample('beta1', beta1_prior)
sigma = numpyro.sample('sigma', sigma_prior)

alpha_sigma = numpyro.sample('alpha_sigma', numpyro.distributions.HalfNormal(1))
beta0_sigma = numpyro.sample('beta0_sigma', numpyro.distributions.HalfNormal(1))
beta1_sigma = numpyro.sample('beta1_sigma', numpyro.distributions.HalfNormal(1))

with numpyro.plate('level1', n_level1_ids, dim = -1):
    level1_alpha_offset = numpyro.sample('level1_alpha_offset', numpyro.distributions.Normal(0, 1))
    level1_alpha = numpyro.deterministic('level1_alpha', alpha + level1_alpha_offset*alpha_sigma)

    level1_beta0_offset = numpyro.sample('level1_beta0_offset', numpyro.distributions.Normal(0, 1))
    level1_beta0 = numpyro.deterministic('level1_beta0', beta0 + level1_beta0_offset*beta0_sigma)
            
    level1_beta1_offset = numpyro.sample('level1_beta1_offset', numpyro.distributions.Normal(0, 1))
    level1_beta1 = numpyro.deterministic('level1_beta1', beta1 + level1_beta1_offset*beta1_sigma)

    level1_alpha_sigma = numpyro.sample('level1_alpha_sigma', numpyro.distributions.HalfNormal(1))
    level1_beta0_sigma = numpyro.sample('level1_beta0_sigma', numpyro.distributions.HalfNormal(1))
    level1_beta1_sigma = numpyro.sample('level1_beta1_sigma', numpyro.distributions.HalfNormal(1))

    with numpyro.plate('level2', n_level2_ids, dim = -2):
        level2_alpha_offset = numpyro.sample('level2_alpha_offset', numpyro.distributions.Normal(0, 1))
        level2_alpha = numpyro.deterministic('level2_alpha', level1_alpha + level2_alpha_offset*level1_alpha_sigma)

        level2_beta0_offset = numpyro.sample('level2_beta0_offset', numpyro.distributions.Normal(0, 1))
        level2_beta0 = numpyro.deterministic('level2_beta0', level1_beta0 + level2_beta0_offset*level1_beta0_sigma)
                    
        level2_beta1_offset = numpyro.sample('level2_beta1_offset', numpyro.distributions.Normal(0, 1))
        level2_beta1 = numpyro.deterministic('level2_beta1', level1_beta1 + level2_beta0_offset*level1_beta1_sigma)

Full model function:

def quadratic_model(x,
                    y = None,
                    level1_id = None,                    
                    level2_id = None,
                    level2_beta = False,
                    inverse = False,
                    standardize = True,
                    priors = None,
                    **kwargs):
    '''
    Bayesian model to estimate the parameters used in the quadratic regression `y = alpha + beta0*x + beta1*x^2`.
    Up to two additional levels are supported to make a hierarchical model.

    Parameters
    ----------
    x : list of float
        The independent variable.
    y : list of float, optional
        Dependent variable.
    level1_id : list of int, optional
        Optional list of integers for additional level in the model.
    level2_id : list of int, optional
        Optional list of integers for additional level in the model.
    level2_name : str, default `level2`
        Set the display name of level 2.
    level2_beta : bool, default False
        Set to `True` if `beta0` & `beta1` should be inferred for level 2.
    inverse : bool, default False
        Flip the independent & dependent variables.
    standardize : bool, default True
        Standardize variables using `sklearn.preprocessing.StandardScaler()`.
        Unscaled parameters are returned in addition to the scaled ones.
    priors : dict, default None
        Provide a dict of priors for the parameters.
        E.g. `{'alpha': numpyro.distributions.Normal(0, 1)}`.
    '''

    x_obs, y_obs = [x, y] if not inverse else [y, x]

    # Standardize inputs
    if standardize:
        xy_scaler = sklearn.preprocessing.StandardScaler()
        xy = np.array(list(zip(x_obs, y_obs)))
        xy_scaled = xy_scaler.fit_transform(xy)
        x_obs, y_obs = np.array(list(zip(*xy_scaled)))
        x_mean, y_mean = xy_scaler.mean_
        x_sd, y_sd = xy_scaler.scale_

        inverse_alpha = lambda alpha, beta0, beta1: y_sd*(alpha - beta0*x_mean/x_sd + beta1*x_mean**2/x_sd**2) + y_mean
        inverse_beta0 = lambda beta0, beta1: beta0*y_sd/x_sd - 2*beta1*x_mean*y_sd/x_sd**2
        inverse_beta1 = lambda beta1: beta1*y_sd/x_sd**2

    level1_check = type(level1_id) is np.ndarray
    level2_check = type(level2_id) is np.ndarray

    # Hyperparameters
    alpha_prior = numpyro.distributions.Normal(0, 1)
    beta0_prior = numpyro.distributions.Normal(0, 1)
    beta1_prior = numpyro.distributions.Normal(0, 1)
    sigma_prior = numpyro.distributions.HalfNormal(1)

    if type(priors) is dict:
        alpha_prior = priors.get('alpha', alpha_prior)
        beta0_prior = priors.get('beta0', beta0_prior)
        beta1_prior = priors.get('beta1', beta1_prior)
        sigma_prior = priors.get('sigma', sigma_prior)

    alpha = numpyro.sample('alpha', alpha_prior)
    beta0 = numpyro.sample('beta0', beta0_prior)
    beta1 = numpyro.sample('beta1', beta1_prior)
    sigma = numpyro.sample('sigma', sigma_prior)

    if standardize:
        numpyro.deterministic('alpha_unscaled', inverse_alpha(alpha, beta0, beta1))
        numpyro.deterministic('beta0_unscaled', inverse_beta0(beta0, beta1))
        numpyro.deterministic('beta1_unscaled', inverse_beta1(beta1))
    
    # Level 1 Parameters
    if level1_check:
        n_level1_ids = len(np.unique(level1_id))

        alpha_sigma = numpyro.sample('alpha_sigma', numpyro.distributions.HalfNormal(1))
        beta0_sigma = numpyro.sample('beta0_sigma', numpyro.distributions.HalfNormal(1))
        beta1_sigma = numpyro.sample('beta1_sigma', numpyro.distributions.HalfNormal(1))

        with numpyro.plate('level1', n_level1_ids, dim = -1):
            level1_alpha_offset = numpyro.sample('level1_alpha_offset', numpyro.distributions.Normal(0, 1))
            level1_alpha = numpyro.deterministic('level1_alpha', alpha + level1_alpha_offset*alpha_sigma)

            level1_beta0_offset = numpyro.sample('level1_beta0_offset', numpyro.distributions.Normal(0, 1))
            level1_beta0 = numpyro.deterministic('level1_beta0', beta0 + level1_beta0_offset*beta0_sigma)
            
            level1_beta1_offset = numpyro.sample('level1_beta1_offset', numpyro.distributions.Normal(0, 1))
            level1_beta1 = numpyro.deterministic('level1_beta1', beta1 + level1_beta1_offset*beta1_sigma)

            if standardize:
                numpyro.deterministic('level1_alpha_unscaled', inverse_alpha(level1_alpha, level1_beta0, level1_beta1))
                numpyro.deterministic('level1_beta0_unscaled', inverse_beta0(level1_beta0, level1_beta1))
                numpyro.deterministic('level1_beta1_unscaled', inverse_beta1(level1_beta1))

            # Level 2 Parameters
            if level2_check:
                n_level2_ids = len(np.unique(level2_id))

                level1_alpha_sigma = numpyro.sample('level1_alpha_sigma', numpyro.distributions.HalfNormal(1))
                
                if level2_beta:
                    level1_beta0_sigma = numpyro.sample('level1_beta0_sigma', numpyro.distributions.HalfNormal(1))
                    level1_beta1_sigma = numpyro.sample('level1_beta1_sigma', numpyro.distributions.HalfNormal(1))

                with numpyro.plate('level2', n_level2_ids, dim = -2):
                    level2_alpha_offset = numpyro.sample('level2_alpha_offset', numpyro.distributions.Normal(0, 1))
                    level2_alpha = numpyro.deterministic('level2_alpha', level1_alpha + level2_alpha_offset*level1_alpha_sigma)

                    if level2_beta:
                        level2_beta0_offset = numpyro.sample('level2_beta0_offset', numpyro.distributions.Normal(0, 1))
                        level2_beta0 = numpyro.deterministic('level2_beta0', level1_beta0 + level2_beta0_offset*level1_beta0_sigma)
                    
                        level2_beta1_offset = numpyro.sample('level2_beta1_offset', numpyro.distributions.Normal(0, 1))
                        level2_beta1 = numpyro.deterministic('level2_beta1', level1_beta1 + level2_beta0_offset*level1_beta1_sigma)

                    if standardize:
                        numpyro.deterministic('level2_alpha_unscaled', inverse_alpha(level2_alpha, level1_beta0, level1_beta1))
                      
                    if (standardize) & (level2_beta):
                        numpyro.deterministic('level2_beta0_unscaled', inverse_beta0(level2_beta0, level2_beta1))
                        numpyro.deterministic('level2_beta1_unscaled', inverse_beta1(level2_beta1))
    
    # Update parameter assignments
    if level1_check:
        alpha = level1_alpha[level1_id]
        beta0 = level1_beta0[level1_id]
        beta1 = level1_beta1[level1_id]
    
    if (level1_check) & (level2_check):
        alpha = level2_alpha[level2_id, level1_id]

    if (level1_check) & (level2_check) & (level2_beta):
        beta0 = level2_beta0[level2_id, level1_id]
        beta1 = level2_beta1[level2_id, level1_id]

    # Likelihood
    mu = alpha + beta0*x_obs + beta1*x_obs**2

    with numpyro.plate('observations', len(x_obs)):
        numpyro.sample('y_obs', numpyro.distributions.Normal(mu, sigma), obs = y_obs)

cstoafer · February 5, 2022, 1:20am

I’m struggling with the same issue. My code runs (with help from this post), but does not produce my intended model. I am trying to build a nested hierarchical model where there is a hierarchy of global → category → title (each title belongs to one category; each category can contain multiple titles).

Instead of following that nested structure, my model generates a parameter for every category, title combination. How can I represent the intended structure?

def model(data, target=None):
    mu_b_global = numpyro.sample('mu_b_global', dist.Normal(0., 5))
    sigma_b_global = numpyro.sample('sigma_b_global', dist.HalfNormal(2))

    with numpyro.plate('plate_categories', category_count, dim=-2):
        mu_b_category = numpyro.sample('mu_b_category', dist.Normal(mu_b_global, sigma_b_global))
        sigma_b_category = numpyro.sample('sigma_b_category', dist.HalfNormal(2))
        with numpyro.plate('plate_titles', category_title_count, dim=-1):
            mu_b_title = numpyro.sample('mu_b_title', dist.Normal(mu_b_category, sigma_b_category))

    est = mu_b_title[category_title_codes.values, category_codes.values]
    sigma = numpyro.sample('sigma', dist.HalfNormal(5))
    with numpyro.plate("data", data.shape[0]):
        numpyro.sample('obs', dist.Normal(est, sigma), obs=target)

cstoafer · February 5, 2022, 1:44am

Here is the simplified model in pymc3:

gb = data.groupby(['title_code', 'category_code']).size().reset_index()
title_lookup = gb.drop_duplicates(subset=['title_code'], keep='first')
title_to_category_idx = title_lookup['category_code']
title_idx = data['title_code'].values
with pm.Model() as model:
    mu_b_global = pm.Normal('mu_b_global', mu=0., sigma=5)
    sigma_b_global = pm.HalfNormal('sigma_b_global', 2)
    mu_b_category = pm.Normal('mu_b_category',
                              mu=mu_b_global,
                              sigma=sigma_b_global,
                              shape=category_count)
    sigma_b_category = pm.HalfNormal('sigma_b_category', 2,
                                     shape=category_count)
    sigma_b_title = sigma_b_category[title_to_category_idx]
    mu_b_title = pm.Normal('mu_b_title',
                           mu=mu_b_category[title_to_category_idx],
                           sd=sigma_b_title,
                           shape=category_title_count)
    
    eps = pm.HalfCauchy('eps', 5.)
    est = mu_b_title[title_idx]
    likelihood = pm.Normal('likelihood', mu=est,
                           sigma=eps, observed=target)

cstoafer · February 5, 2022, 5:47am

I figured it out by reviewing documentation about broadcasting and previous forum posts.

title_to_category_idx = data.groupby('title_code')['category_code'].first()
def model(data, target=None):
    mu_b_global = numpyro.sample('mu_b_global', dist.Normal(0., 5))
    sigma_b_global = numpyro.sample('sigma_b_global', dist.HalfNormal(2))

    with numpyro.plate('plate_categories', category_count):
        mu_b_category = numpyro.sample('mu_b_category', dist.Normal(mu_b_global, sigma_b_global))
        sigma_b_category = numpyro.sample('sigma_b_category', dist.HalfNormal(2))

    with numpyro.plate('plate_titles', category_title_count):
        mu_b_title = numpyro.sample('mu_b_title', dist.Normal(mu_b_category[title_to_category_idx.values], sigma_b_category[title_to_category_idx.values]))

    est = mu_b_title[data.title_code.values]
    sigma = numpyro.sample('sigma', dist.HalfNormal(5))
    with numpyro.plate("data", data.shape[0]):
        numpyro.sample('obs', dist.Normal(est, sigma), obs=target)

jpblazick · October 28, 2022, 9:58pm

I’m still having issues with this. Getting a broadcast error on the nested mu_b_title line. Any ideas?

def model(gl, gvs, gv_count, order_count, sc, hb_class, dfu_class, n_gls, n_scc, N):
    mu_gl = numpyro.sample("mu_gl", dist.Normal(-2, 1.0))
    sigma_gl = numpyro.sample("sigma_gl", dist.HalfNormal(1.0))
    mu_gvs = numpyro.sample("mu_gvs", dist.Normal(-.4, .1))
    sigma_gvs = numpyro.sample("sigma_gvs", dist.HalfNormal(1.0))
    sigma_subcat_gvs = numpyro.sample('sigma_subcat_gvs', dist.HalfNormal(1.0))

    with numpyro.plate("gl_i", n_gls):
        alpha_gl = numpyro.sample("alpha_gl", dist.Normal(mu_gl, sigma_gl))
        beta_gvs_gl = numpyro.sample("beta_gvs_gl", dist.Normal(mu_gvs, sigma_gvs))
        sigma_gvs_gl = numpyro.sample('sigma_gvs_gl', dist.HalfNormal(2.))
    
    with numpyro.plate("subcat_i", n_scc):
        beta_subcat_gvs = numpyro.sample('beta_subcat_gvs', dist.TruncatedNormal(beta_gvs_gl[gl], sigma_gvs_gl[gl], high=-.001))
    
    beta_hb = numpyro.sample('beta_hb', dist.Normal(0, 5))

    p = sigmoid(alpha_gl[gl] + beta_subcat_gvs[sc]*gvs + beta_hb*hb_class)
    numpyro.sample('probs', dist.Delta(p), obs=p)
    with numpyro.plate("data", N):
        numpyro.sample("obs", dist.Binomial(gv_count, probs = p), obs = order_count)```

cstoafer · October 28, 2022, 10:31pm

How is gl defined? I think the issue is with beta_gvs_gl[gl] (same for sigma_gvs_gl[gl]) where gl is used to map a subcat to a gl.

In my example there are 100 titles that map to 10 categories. I used mu_b_category[title_to_category_idx.values] to map title means to category means, where title_to_category_idx.values is an array of length 100. Each value is between 0 and 9 to map to the 10 categories. You’ll need to make sure gl is structured similarly.

I hope that helps; it’s been a while since I’ve worked with my code.