MCMC with Pyro's GPLVM doesn't generate correct posterior

I’m trying to perform MCMC using a convoluted example of PCA where the posterior is known to be non-gaussian, using the GPLVM module, but I don’t seem to recover the correct posterior (for “X” / latent variable).

A minimal working example, where I code up the model and compare the results to the gplvm class:


import pyro, torch
import numpy as np
import pyro.contrib.gp as gp
import pyro.distributions as dist
import matplotlib.pyplot as plt

plt.ion(); plt.style.use('ggplot')

def mcmc(model):
    nuts = pyro.infer.NUTS(model)
    MCMC = pyro.infer.MCMC(
            kernel=nuts,
            num_samples=300,
            warmup_steps=300,
            num_chains=1)
    MCMC.run()

    X_mcmc = MCMC.get_samples()['X']
    plt.scatter(X_mcmc[:, 0, 0], X_mcmc[:, 0, 1], alpha=0.1)

if __name__ == '__main__':

    #########################################
    # Create Synthetic Data

    np.random.seed(42)
    n = 2; q = 2; m = 10

    X = np.random.normal(size = (n, q))
    W = np.random.normal(size = (q, m))
    Y = torch.tensor(X @ W).float()

    #########################################
    # MCMC with model

    def model():
        zeros = torch.zeros((n, q))
        ones = torch.ones((n, q))
        X = pyro.sample('X', dist.Normal(zeros, ones))

        zeros = torch.tensor([0]).float()
        sigma = X @ X.T + torch.eye(n)*1e-3
        sigma = torch.cat([sigma[None, ...]]*m, axis=0)
        pyro.sample('Y', dist.MultivariateNormal(zeros, sigma), obs=Y.T)

    mcmc(model) # a ring

    '''
    MCMC with pyro's GPLVM doesn't generate the right latent distribution.
    '''

    X_init = torch.tensor(np.random.normal(size=(n, q))).float()

    kernel = gp.kernels.Linear(q)
    kernel.variance = pyro.sample('variance', dist.Delta(torch.ones(2)))

    gpmodule = gp.models.GPRegression(X_init, Y.T, kernel, jitter=1e-6)
    gplvm = gp.models.GPLVM(gpmodule)

    mcmc(gplvm.model) # a blob

@aditya It seems that you missed the term noise=torch.tensor(1e-3) in GPRegression.

@fehiepsi Interesting. This does seem to work. What’s the difference between jitter and noise? (in this example, as the noise is small too; also I don’t see any other variables in the mcmc trace, e.g. noise)

jitter is used to deal with some numerical issues when taking Cholesky of covariance matrix. You can set jitter=0 if there is no numerical issue with your model. I looked at your model and I saw you added a white noise to sigma. The noise term does that role. By default, GPR has noise=1.

Gotcha - the standard deviation was too big, amazing. Is this optimised? as I couldn’t find the trace of this in the MCMC trace.

If you want white noise to be a latent variable, then you can do something like

gpr.noise = PyroSample(dist.LogNormal(-3, 1))

MCMC will sample both X and noise for you.