Bayesian Multiple Linear Regression: Parameter Constraints

threepwood · November 10, 2022, 7:28pm

Hi numpyro community

I am new to numpyro and also just started my Bayesian career.
So, this forum is really helpful - thanks a lot!

Using the model below, I am trying to attribute partial amounts of total volumes (across different regions) to different customer classes.

Now, I would like to constrain my customer class parameters such that
C1 > C2 > C3 ... > C8 > 0

From reading it seems that parameter constraints may lead to inefficient MCMC sampling.
Also, I have the feeling that a different parameterization of my model might be part of the solution.
Unfortunately however, I was not able to find a valid solution, yet.
So any help is much appreciated!

# customer class labels
reg_labels = ["C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8"]

def model(X, volume):              
    betas = []
    for reg_label in reg_labels:
        betas.append(numpyro.sample(f"b{reg_label}", dist.Uniform(0, 20000)))
    
    sigma = numpyro.sample("sigma", dist.Exponential(1.0))
    mu = jnp.dot(X, jnp.array(betas))
    
    numpyro.sample("obs", dist.Normal(mu, sigma), obs=volume)

PS: Any additional improvement hints regarding my basic simple bayesian model are much appreciated, too!

Elchorro · November 10, 2022, 10:59pm

Your question is not fully clear to me. Are you trying to ensure that betas are an ordered vector? If so, you probably need to think of the parameters over which you sample as the differences between consecutive elements. Bob Carpenter discusses how Stan implements the ordered constraint here here.

Does the upper bound on uniform matter, or is it just meant to be some very-big-number. If the latter, then maybe something like this would work.

def model():
    ...
    with numpyro.plate("pl", len(reg_labels)):
        etas = numpyro.sample("beta", dist.Uniform(0, 2000))
    betas = numpyro.deterministic("betas", jnp.cumsum(etas))
    ....

Otherwise, if the parameters need to be bounded above, then something like this might work:

def model2():
    ...
    betas = [0]
    for i in range(len(reg_labels)):
        eta_i = numpyro.sample(f"e{reg_labels[i]}", dist.Uniform(0, 2000 - betas[i]))
        betas.append(eta_i + betas[i])
    betas = numpyro.deterministic("betas", jnp.array(betas[1:]))
    ....

But notice that the prior will not have uniform marginal distributions anymore since in this case P(C1 > 1/2 K) > P(C1 < 1/2 K), where K is the upper bound.

Let me know if I misunderstood your question.

fehiepsi · November 11, 2022, 12:23pm

You can also use ImproperUniform distribution with positive_ordered_vector support.