Hello friends. I have a question. I would like to use MCMC to marginalize over which of several models my data conforms to. I am currently implementing something like this:

```
# Mixture weights over the ref_cyc_n reference models, with a flat Dirichlet prior
weights = numpyro.sample("match", dist.Dirichlet(jnp.ones(ref_cyc_n)))
# Discrete assignment of the data to one reference model, marginalized by enumeration
assignment = numpyro.sample("assignment", dist.Categorical(weights), infer={"enumerate": "sequential"})
# Pick out the reference probabilities for the sampled assignment
assigned_ref_prob = Vindex(ref_prob)[assignment, :]
```

In short, I sample weights from a Dirichlet, draw a categorical assignment from those weights, and use that assignment to index my data. The discrete assignment limits the samplers I can use for my problem. Does anyone know of a way I could approximate/implement this using only continuous variables? Thanks a lot!

If I’m understanding this right, this is an example of a mixture model, yes? You have a sequence of models that each produce different outputs, and you want to marginalize over the different ways of describing the data: basically saying that the outputs might be drawn from any one of the models with some probability, and then marginalizing over what those probabilities could be?
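For intuition, that marginalization has a closed form: the marginal log-likelihood is a log-sum-exp over the per-model log-likelihoods plus log-weights, which is exactly what a mixture distribution evaluates. A minimal JAX sketch (the weights and per-model log-likelihoods below are made-up numbers, just to show the arithmetic):

```python
import jax.numpy as jnp
from jax.scipy.special import logsumexp

# Hypothetical log-likelihoods of one data point under each of three models
log_lik = jnp.array([-1.2, -0.8, -3.5])   # log p(y | model k)
# Hypothetical mixture weights (in the sampler these come from the Dirichlet)
weights = jnp.array([0.5, 0.3, 0.2])

# log p(y) = log sum_k w_k * p(y | model k), evaluated stably
log_marginal = logsumexp(jnp.log(weights) + log_lik)
print(float(log_marginal))
```

Because the sum over the discrete assignment is done analytically, no discrete sample site is left in the model.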

```
# Weights for how likely each model is, with a flat Dirichlet prior
weights = numpyro.sample("weights", dist.Dirichlet(jnp.ones(n_models)))
# Mixing distribution over the models (a distribution object, not a sample site)
assignment = dist.Categorical(weights)
# Working for model 1...
ydist_model1 = [SOME DISTRIBUTION]
# Working for model 2...
ydist_model2 = [SOME DISTRIBUTION]
# Continue for as many models as you need...
# Bring together with a mixture model, which marginalizes the assignment for you
mixture_dist = numpyro_ext.distributions.MixtureGeneral(assignment, [ydist_model1, ydist_model2, ...])
y = numpyro.sample('y', mixture_dist, obs=Y)
```


Thank you so much @HughMcDougallAstro!! I know it was a simple concept, but this really helped!

With this change I am able to use `ESS`, while before I couldn't. You are the best.