Add sample from a known distribution or "nuisance" samples

renecotyfanboy · March 4, 2024, 4:45pm

Hi everyone,
I would like to sample from a known distribution in my model and make sure that distribution is not updated when running the MCMC. At the moment, when I call numpyro.sample('Noise', dist.Normal(0, 1)) and add it to the observed value, the Noise parameter is constrained by the data like the other parameters. Is there a clean way to make sure this parameter stays unchanged during the MCMC?

HughMcDougallAstro · March 4, 2024, 11:34pm

I’m unclear what you mean. You have some sort of additional noise as a part of your observations, and you want to remove those outliers? Could you express this graphically or in equation form?

renecotyfanboy · March 5, 2024, 8:18am

Sorry if I am not clear enough. I really lack the words to express my issue in a Bayesian framework, but I’ll do my best. Let’s steal your introductory tutorial (neat job, btw):

Just take the case of Y ~ m*X + c

import numpyro

def model(X, Y, E):
    m = numpyro.sample('m', numpyro.distributions.Uniform(-5, 5))  # prior on m
    c = numpyro.sample('c', numpyro.distributions.Uniform(-5, 5))  # prior on c

    y_model = m * X + c  # linear prediction at each X

    # one observed site per data point, with measurement uncertainty E[i]
    for i in range(len(X)):
        numpyro.sample('y_%i' % i, numpyro.distributions.Normal(y_model[i], E[i]), obs=Y[i])

My specific situation is that I know the distribution of c, let’s say c ~ N(0, 3.5). But if I declare c = numpyro.sample('c', numpyro.distributions.Normal(0, 3.5)) in the model, it is treated as a prior distribution and updated by the data as the sampling progresses. I would like to avoid this, so that the distribution of c after sampling is still c ~ N(0, 3.5). Is that clearer?

HughMcDougallAstro · March 6, 2024, 3:39am

Ah okay, I think I understand. You want your samples for c to obey the prior distribution instead of the posterior distribution. There are two ways I can think of.

Firstly, just tweak your results such that the parameter is re-drawn from the prior, e.g.:

Manually Overwrite Results

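import numpy as np

samps = sampler.get_samples()  # sampler: an MCMC object that has already been run
N = len(samps['c'])
samps['c'] = np.random.randn(N) * 3.5  # redraw c from N(0, 3.5), ignoring the posterior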

The other approach is to use the effect handlers, a suite of tools for editing the behavior of models from the outside. I think the handler you want is “do”, but I haven’t tested it, so you might need “lift” instead. Handlers are called like so:

Using effect handlers

from numpyro import handlers
from numpyro import distributions as dist
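# data must be a dict keyed by site name (a set literal {"c", ...} would not work).
# Untested: do is documented to take concrete values, so a distribution may not be accepted here.
altered_model = handlers.do(model, data={"c": dist.Normal(0, 3.5)})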
samps = sampler.get_samples()
N = len(samps['c'])
samps['c'] = np.random.randn(N) * 3.5

Apologies if this example doesn’t work perfectly, I didn’t get a chance to test it myself. Let me know how it goes.

renecotyfanboy · March 6, 2024, 10:46am

Hi Hugh,

Thanks again for your feedback. If I simply overwrite the posterior samples, the result would not be representative of what the sampler explored during its run. I didn’t know about effect handlers or how to use them, so thank you for pointing them out! It could have worked if the site values could be fixed stochastically with a distribution (from what I understood, the do handler sets the prior distribution for a parameter, and the condition handler sets a site to a given value).

I found a workaround in this thread, where a custom sampling scheme for each site can be defined using HMCGibbs. And this seems to work!