Hi there,
I am trying to fit a mixture of Gaussians to my data. I have two questions:
Is the approach I am using the right one?
How can I assign membership probabilities to each data point? I mean, I would like to know the probability of each data point belonging to each of the Gaussians.
If I access the model results with get_samples, for instance, I get probabilities along the sample dimension (one entry per posterior draw). I would like these probabilities for each data point instead (along the length of the data). Do I have to iterate over the data inside the model (with a plate) and compute something there? If so, I don’t know what exactly.
In the data I have, one of the components is clearly larger than the other (this is why I set the weights to 0.7 and 0.3, respectively). See the plot below. However, in the results I get, the weights are about 0.5 and 0.5.
Sorry, my last comment is wrong. I guess you can do:
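from jax.nn import softmax
d = mixture_distribution  # e.g. a dist.MixtureSameFamily instance
# per-component log-likelihood of each data point plus the log mixing weights
log_probs = d.component_distribution.log_prob(data[..., None]) + d.mixing_distribution.logits
# normalize over the component axis: one row of membership probabilities per data point
probs = softmax(log_probs, axis=-1)
# optional: store these probs in the trace
numpyro.deterministic("probs", probs)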
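To make it clearer what that snippet computes: for each data point, the membership probability of component k is proportional to weight_k times the Gaussian density of the point under component k, normalized over components. Below is a minimal, library-free sketch of the same calculation in plain Python. The function names and the numbers (weights, means, standard deviations, data) are made up for illustration and are not taken from the model in this thread.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of a univariate Gaussian N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def membership_probs(data, weights, mus, sigmas):
    """Return one row per data point holding P(component k | x) for each k."""
    rows = []
    for x in data:
        # Unnormalized log posterior: log w_k + log N(x | mu_k, sigma_k)
        logs = [
            math.log(w) + normal_logpdf(x, m, s)
            for w, m, s in zip(weights, mus, sigmas)
        ]
        # Softmax over components, subtracting the max for numerical stability
        top = max(logs)
        exps = [math.exp(v - top) for v in logs]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows


# Hypothetical values, chosen only for illustration
data = [-2.0, 0.0, 3.0]
probs = membership_probs(
    data, weights=[0.7, 0.3], mus=[-2.0, 3.0], sigmas=[1.0, 1.0]
)
```

Here probs has one row per data point and one column per component, which is the "over the length of the data" shape asked about. With posterior samples rather than fixed parameters, one option is to apply the same calculation per draw and average over draws.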
If you use the dev branch (or the latest release), you can do