Multidimensional Gaussian Mixture Model from data


Is it possible to infer a multidimensional Gaussian mixture model directly from a matrix of observations using Pyro?
(Ideally, the number of mixture components should also be estimated from the data.)

Hi, see the Gaussian mixture model tutorial for an introduction to mixture models in Pyro, and the Dirichlet process mixture model tutorial for an example of fitting a mixture model with an unknown number of components.
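For context on the "unknown number of components" part: a common way to approximate a Dirichlet process mixture is a truncated stick-breaking construction, where Beta-distributed fractions are converted into mixture weights. The sketch below uses plain PyTorch; the truncation level `T`, the concentration `alpha`, and the helper name `mix_weights` are illustrative assumptions, not something stated in this thread.

```python
import torch
import torch.nn.functional as F

def mix_weights(beta):
    """Turn T-1 stick-breaking fractions into T mixture weights that sum to 1."""
    # Fraction of the stick still remaining after each break.
    remaining = (1 - beta).cumprod(-1)
    # weight_k = beta_k * prod_{j<k} (1 - beta_j); the last weight takes what is left.
    return F.pad(beta, (0, 1), value=1) * F.pad(remaining, (1, 0), value=1)

T = 6        # truncation level: an upper bound on the number of components (assumed)
alpha = 0.5  # concentration; smaller values favour fewer active components (assumed)

beta = torch.distributions.Beta(1.0, alpha).sample((T - 1,))
weights = mix_weights(beta)
print(weights, weights.sum())
```

After fitting, components whose weight stays close to zero can be treated as unused, which is how the effective number of modes is read off in this kind of model.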

“we do not parameterize the covariance matrices of the Gaussians, though this should be done when analyzing a real-world dataset for more flexibility”

Arghhh… the most interesting part is left out of the Dirichlet process mixture model tutorial :frowning:

Any hints on this, please?

You can just sample multiple scales along with the means. For the simpler model in the Gaussian mixture model tutorial, that means moving the scale variable inside the components plate and indexing it with the assignment in the likelihood:

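import torch
import pyro
import pyro.distributions as dist

K = 2  # number of mixture components (assumed here; choose to suit your data)

def model(data):
    weights = pyro.sample('weights', dist.Dirichlet(0.5 * torch.ones(K)))
    with pyro.plate('components', K):
        scales = pyro.sample('scale', dist.LogNormal(0., 2.))  # one scale per component
        locs = pyro.sample('locs', dist.Normal(0., 10.))  # one mean per component

    with pyro.plate('data', len(data)):
        assignment = pyro.sample('assignment', dist.Categorical(weights))  # component index per point
        pyro.sample('obs', dist.Normal(locs[assignment], scales[assignment]), obs=data)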

In the multivariate setting, you can do something similar using the LKJCorrCholesky distribution (docs, example) to define a prior distribution over covariance Cholesky factors for each mixture component.

Regarding the “Dirichlet process mixture model tutorial”:

This section of the tutorial doesn’t seem to be reproducible anymore:

Hi All,

Forgive me, I am new to Pyro, but I tried to implement the discussion above because I need a multivariate mixture with a full covariance matrix. I thought the code would be too verbose to post inline, so I included it in a gist here.

Inference works reasonably well, but there are two major problems:

  1. A very large number of iterations is required to reach a reasonable estimate. I suspect this is due to floating-point precision (I had trouble with double()) as well as poor initialization.
  2. The LKJ prior passed to scale_tril appears to learn only the lower triangular part of the covariance, which is what scale_tril is for, right? How do we allow the upper triangular part to be “learnable”?

If anyone has a chance to take a look that would be great.
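On the second point, a small numerical check may help. A lower-triangular `scale_tril` factor L defines the full covariance as L Lᵀ, which is symmetric, so its upper triangle is determined by the lower one and there is nothing separate to learn there. The sketch below uses plain PyTorch with made-up numbers for `L` and `x`.

```python
import torch

# A lower-triangular Cholesky factor (illustrative values).
L = torch.tensor([[2.0, 0.0],
                  [0.6, 1.5]])

# The implied covariance is full and symmetric, with non-zero off-diagonal entries.
cov = L @ L.T
print(cov)

# Both parameterizations describe the same distribution.
mvn_tril = torch.distributions.MultivariateNormal(torch.zeros(2), scale_tril=L)
mvn_cov = torch.distributions.MultivariateNormal(torch.zeros(2), covariance_matrix=cov)

x = torch.tensor([0.3, -1.2])
print(mvn_tril.log_prob(x), mvn_cov.log_prob(x))
```

The two log-probabilities agree, so learning the lower-triangular factor is equivalent to learning the whole covariance matrix.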

