Vanilla GPLVM implementation

Is it possible to implement the non-Bayesian version of the GPLVM where the latent X and hypers are learnt through optimisation of the marginal likelihood?

I think you can simply set gplvm.X = PyroParam(init_value) after creating an instance gplvm = GPLVM(...).

Thanks, but from looking at the code in GPLVM it seems like it automatically assigns a Normal guide when it is initialised.

This is my test code for the iris data, aiming to do just vanilla GPLVM (no priors or anything), just point estimates.

import sklearn.datasets as skd
import matplotlib.pylab as plt
import numpy as np
import pandas as pd
import torch 
import pyro
import pyro.contrib.gp as gp
import pyro.distributions as dist
import pyro.ops.stats as stats

# y is the Iris data of shape 150x4 (transposed below to 4x150, since Pyro's GP
# models expect y of shape (output_dim, N)); we use GPLVM to learn a latent X,
# kept here at shape 150x4 (i.e. Q = D).

iris_data = skd.load_iris()
df = pd.DataFrame(data=iris_data.data, index=iris_data.target)

# First, define the initial values for X parameter:

X_init = torch.zeros(150, 4)

# 4d iris data

y = torch.tensor(iris_data.data.T).float()
    
kernel = gp.kernels.RBF(input_dim=4, lengthscale=torch.ones(4)) 
Xu = torch.zeros(20, 4)  # initial inducing inputs of sparse model

gpmodule = gp.models.SparseGPRegression(X_init, y, kernel, Xu, jitter=1e-5)

# Finally, wrap gpmodule by GPLVM, optimize, and get the "learned" mean of X:
    
gplvm = gp.models.GPLVM(gpmodule)
X = pyro.nn.PyroParam(X_init)

losses = gp.util.train(gplvm, num_steps=5000)  

# Inspect learned parameters
print("Learned parameters:")
for name, param in gplvm.named_parameters():
    print(name, param.data.numpy())

X_loc = gplvm.X_loc.detach()
X_scale = gplvm.X_scale.detach()

# Plotting

colors = plt.get_cmap("tab10").colors[::-1]
labels = np.unique(iris_data.target)

for i, label in enumerate(labels):
    X_i = X_loc[df.index == label]
    plt.scatter(X_i[:, 0], X_i[:, 1], c=[colors[i]], label=label) 

Seems OK, but there is still an X_scale… I'm confused.

@fehiepsi is suggesting you do

gplvm.X = ...

not

X = ...

Thanks,

so essentially,

gplvm.X = pyro.nn.PyroParam(X_init) # is for learning a point estimate

and

gplvm.X = pyro.nn.PyroSample(dist.Normal(0, 0.1).to_event()) # is for learning a distribution

I modified my code to learn a point estimate… as per below:

# y is the Iris data of shape 150x4 (transposed below to 4x150, since Pyro's GP
# models expect y of shape (output_dim, N)); we use GPLVM to learn a latent X,
# kept here at shape 150x4 (i.e. Q = D).

iris_data = skd.load_iris()
df = pd.DataFrame(data=iris_data.data, index=iris_data.target)

# First, define the initial values for X parameter:

X_init = torch.zeros(150, 4)

# 4d iris data

y = torch.tensor(iris_data.data.T).float()
    
kernel = gp.kernels.RBF(input_dim=4, lengthscale=torch.ones(4)) 
Xu = torch.zeros(20, 4)  # initial inducing inputs of sparse model

gpmodule = gp.models.SparseGPRegression(X_init, y, kernel, Xu, jitter=1e-5)

# Finally, wrap gpmodule by GPLVM, optimize, and get the "learned" mean of X:
    
gplvm = gp.models.GPLVM(gpmodule)
gplvm.X = pyro.nn.PyroParam(X_init)

losses = gp.util.train(gplvm, num_steps=5000)  

# Inspect learned parameters
print("Learned parameters:")
for name, param in gplvm.named_parameters():
    print(name, param.data.numpy())

I am wondering what 'X_unconstrained' is in the learned parameters?

@vr308 Sorry, it seems that setting X as a Param does not work (under the hood we set gplvm.base_model.X = gplvm.X, so when X is a PyTorch Parameter it creates a new parameter base_model.X… I get lost in that logic). You can use gplvm.autoguide("X", dist.Delta) instead of gplvm.X = PyroParam(...).
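For example, something along these lines (a minimal sketch; with a Delta autoguide the trained point estimate ends up in gplvm.X_map):

gplvm = gp.models.GPLVM(gpmodule)
gplvm.autoguide("X", dist.Delta)   # Delta guide => point (MAP) estimate of X
losses = gp.util.train(gplvm, num_steps=5000)
X_map = gplvm.X_map.detach()       # learned latent positions, here of shape 150x4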

Thanks @fehiepsi, it now gives me an X_map, which makes sense.

My eventual aim is to model X with a flexible distribution (e.g. a flow-based transformed Gaussian). Is it as simple as replacing the guide with the normalising flow and leaving SVI inference to learn all the variational params?

Yes, I think that is the right way to do it: run SVI on gplvm.model with guide = AutoNormalizingFlow(gplvm.model, ...).
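Roughly something like this, if it helps (just a sketch using the planar helper; AutoNormalizingFlow builds its flow over the flattened latent X):

from functools import partial
from pyro.infer.autoguide import AutoNormalizingFlow
from pyro.distributions.transforms import iterated, planar

gplvm = gp.models.GPLVM(gpmodule)
# stack 2 planar transforms; the helper is called with the flattened latent dimension
guide = AutoNormalizingFlow(gplvm.model, partial(iterated, 2, planar))

svi = pyro.infer.SVI(gplvm.model, guide,
                     pyro.optim.Adam({"lr": 0.01}),
                     pyro.infer.Trace_ELBO())
for step in range(5000):
    svi.step()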

@fehiepsi thanks!!

Would be really grateful if you could review the code below; I am a bit out of my depth when it comes to writing a custom guide with a flexible distribution… I know the guide needs to follow the same construction as the model, but since the model is wrapped as GPLVM(SparseGPRegression) it gets a bit confusing…

gpmodule = gp.models.SparseGPRegression(X_init, y, kernel, Xu, jitter=1e-4)

#----wrap gpmodule with GPLVM
    
gplvm = gp.models.GPLVM(gpmodule)

#----Learning a point estimate for X

gplvm.autoguide("X", dist.Delta)
losses = gp.util.train(gplvm, num_steps=5000)  # this works ok 

#----Learning a Gaussian distribution over each latent X.

gplvm = gp.models.GPLVM(gpmodule)
gplvm.autoguide("X", dist.Normal)
losses = gp.util.train(gplvm, num_steps=5000) # this works ok too

#----Learning a flexible distribution using flows for X

gplvm = gp.models.GPLVM(gpmodule)

import pyro.distributions.transforms as trans  # needed for trans.Planar below

def guide():  # ---> not sure what arguments should go here
    base_dist = dist.MultivariateNormal(loc=torch.zeros(4), covariance_matrix=torch.eye(4))
    planar_transform = trans.Planar(input_dim=4)
    
    # variational params to learn    --> I can't seem to pass these params to the Planar transform
    #bias = pyro.param('bias', torch.tensor(0.))
    #u = pyro.param('u', torch.tensor([1.0]))
    #w = pyro.param('w', torch.tensor([1.0]))
    
    #planar_transform.bias = bias
    
    flow_dist = dist.TransformedDistribution(base_dist, [planar_transform])
    pyro.sample("latent", flow_dist)

gplvm.guide = guide()

adam = pyro.optim.Adam({"lr": 0.03})
svi = pyro.infer.SVI(model=gplvm.model,
                     guide=gplvm.guide,
                     optim=adam,
                     loss=pyro.infer.Trace_ELBO())

n = 10000; losses = np.zeros(n)
for step in np.arange(n):
    losses[step] = svi.step()

I am probably doing a bunch of things incorrectly, but my intuition is that passing a custom distribution as the guide and training the GPLVM with SVI should be fairly doable…

I just have some thoughts. Could you try some of them to see if they work?

  • Because gplvm.model has no input, your guide should also have no input.
  • Because the latent variable in gplvm is X, you should have pyro.sample("X", ...) in your guide.
  • You can train the pair gplvm.model and guide with SVI (no need to redefine or use gplvm.guide); see the sketch after this list.
  • GPLVM is a kind of dimensionality reduction technique, so it is better to let X have fewer than 4 dimensions (e.g. 2). Or maybe keeping 4 is your intention?
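Putting these together, a minimal no-argument guide could look roughly like this (just a sketch with a diagonal Normal; the param names are illustrative and the (150, 4) shape must match X in your model):

def guide():
    # variational parameters for q(X); shapes match the latent X
    loc = pyro.param("X_guide_loc", torch.zeros(150, 4))
    scale = pyro.param("X_guide_scale", 0.1 * torch.ones(150, 4),
                       constraint=dist.constraints.positive)
    pyro.sample("X", dist.Normal(loc, scale).to_event(2))

svi = pyro.infer.SVI(gplvm.model, guide, pyro.optim.Adam({"lr": 0.01}),
                     pyro.infer.Trace_ELBO())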

Thank you!

Will try that… about your last point, I realised it is better to let Q = D for training and then select the latent dimensions based on the size of the inverse lengthscales.
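For instance (a sketch, assuming an ARD RBF kernel with one lengthscale per latent dimension and the default Normal guide, so the learned mean is gplvm.X_loc):

# rank latent dimensions by inverse lengthscale (larger => more relevant)
inv_lengthscale = 1.0 / gplvm.base_model.kernel.lengthscale.detach()
top2 = torch.argsort(inv_lengthscale, descending=True)[:2]
X_2d = gplvm.X_loc.detach()[:, top2]   # keep the two most relevant dimensions for plotting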

@fehiepsi

The thing I haven't figured out is how to pass the learnable params (flow params) in the guide. For instance, the Planar transform has a bias, u and w, which I want to declare as params in the guide, but there seems to be no way to initialise the Planar transform with those params.

def guide():
    base_dist = dist.MultivariateNormal(loc=torch.zeros(4), covariance_matrix=torch.eye(4))
    planar_transform = trans.Planar(input_dim=4)
    
    # variational params to learn
    bias = pyro.param('bias', torch.tensor(0.))
    u = pyro.param('u', torch.tensor([1.0]))
    w = pyro.param('w', torch.tensor([1.0]))
    
    planar_transform.bias = bias  # --> this doesn't work
    
    flow_dist = dist.TransformedDistribution(base_dist, [planar_transform])
    pyro.sample("X", flow_dist)

It seems Planar already has its parameters defined. I guess you can do pyro.module("my_transform", planar_transform) as in its docs. Otherwise, you can convert it to a PyroModule.
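Something along these lines might work (a rough sketch; "planar_flow" is just an arbitrary registration name, and the (150, 4) base shape must match X):

import pyro.distributions.transforms as trans

planar_transform = trans.Planar(input_dim=4)   # u, w and bias live inside this module

def guide():
    # register the transform's parameters with Pyro so SVI updates them
    pyro.module("planar_flow", planar_transform)
    base_dist = dist.Normal(torch.zeros(150, 4), torch.ones(150, 4)).to_event(2)
    flow_dist = dist.TransformedDistribution(base_dist, [planar_transform])
    pyro.sample("X", flow_dist)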

You might also want to try AutoNormalizingFlow.