Question about SparseGPRegression Model in Pyro GP

Dear Du,

Can I share my program code and data on Dropbox with you, so that you may have a look at why X_loc does not change at all? What is your email address?

Hi @junbin.gao, I would prefer to have an open discussion. There are already many discussions about GPLVM in this forum. Each topic deals with it in a different way, but the core principle should be the same. It would be nice to have some replicable code with synthetic data, so that other Pyro users can play with it or join the discussion.

That is okay. I have simplified my program; X_loc does not change, and I am not sure where I went wrong.

J.

================ Program ============================

import torch
import numpy as np 

from torch.nn import Parameter
 
import matplotlib.pyplot as plt  
import warnings
warnings.filterwarnings("ignore")
 
import pyro
import pyro.contrib.gp as gp
import pyro.ops.stats as stats
import pyro.distributions as dist

pyro.enable_validation(True)       # can help with debugging
np.random.seed(0)
torch.manual_seed(0)
pyro.set_rng_seed(0)

Y = torch.randn(2708, 1043)    
X_prior_mean =  torch.randn(2708, 32)  
 
pyro.clear_param_store()

latentDim = X_prior_mean.shape[1]
 
kernel = gp.kernels.RBF(input_dim=latentDim,
                        lengthscale=torch.ones(latentDim))
 
X_prior_mean =  X_prior_mean.to(torch.float64) 
X = Parameter(X_prior_mean.clone())

Xu = stats.resample(X_prior_mean.clone(), 32)

gplvm = gp.models.SparseGPRegression(X, Y.T, 
            kernel, Xu, noise=torch.tensor(0.0001), jitter=1e-5)

# gplvm.X  = pyro.nn.PyroSample(dist.MultivariateNormal(
#     X_prior_mean.new_zeros(X_prior_mean.shape[1], X_prior_mean.shape[0]),
#     precision_matrix = torch.from_numpy(L) ).to_event())
# gplvm.X = pyro.nn.PyroSample(dist.MultivariateNormal(
#     X_prior_mean.t(), precision_matrix = 0.1 * torch.from_numpy(L) ))  
gplvm.X = pyro.nn.PyroSample(dist.Normal(X_prior_mean, 0.1).to_event())

#gplvm.autoguide("X", dist.Normal)    # This one does not work either
gplvm.autoguide("X", dist.MultivariateNormal)

gplvm.mode = "guide"
# Print the variational parameters of the unlearned posterior,
# i.e., the guide (Pyro's term), before training
print(gplvm.X_loc)    
print(gplvm.X_scale_tril_unconstrained)  # All zeros 

#note that training is expected to take a minute or so
gplvm.mode = "model"
gplvm.set_data(X_prior_mean, Y.T)
# Please change num_steps to 1000
losses = gp.util.train(gplvm, num_steps=10)

gplvm.mode = "guide"
print(gplvm.X_loc)  #  This does not change
print(gplvm.X_scale_tril_unconstrained) # All zeros no change
plt.figure()
plt.plot(losses)
plt.show()

#gplvm.mode = "guide"
#X = gplvm.X  # draw a sample from the guide of the variable X

# Note: the mean of the posterior Gaussian is X_loc, so we extract it as
# the latent variables (i.e., the data's embedding). It needs a transpose
# if a MultivariateNormal posterior or prior is used.
X = gplvm.X_loc.detach().numpy()  #.T     
plt.figure()
plt.scatter(X[:,0],X[:,1])
# This visualization may not be informative, because it shows only
# 2 of the latentDim (= 32) dimensions
plt.show()    

The statement gplvm.set_data(X_prior_mean, Y.T) will set gplvm.X = X_prior_mean, so X is not a PyroSample anymore.

How should I modify that statement? That is, how do I initialize the latent values?
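(For illustration, a minimal sketch of one way to do this, assuming the program above: initialize the guide parameter X_loc that autoguide creates, instead of calling set_data, which overwrites the PyroSample.)

# Hedged sketch, not the thread's code: keep the prior and autoguide as
# in the program above, then set the initial latent values on the guide.
gplvm.X = pyro.nn.PyroSample(dist.Normal(X_prior_mean, 0.1).to_event())
gplvm.autoguide("X", dist.MultivariateNormal)
gplvm.X_loc.data = X_prior_mean.clone()   # initial latent values
# no gplvm.set_data(X_prior_mean, Y.T) here -- that would replace gplvm.X
losses = gp.util.train(gplvm, num_steps=10)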

When I delete

gplvm.set_data(X, Y.T)

then X_loc starts changing. But I got another shape issue; I will send you the program.

J.

I want to use an MVN prior, that is, a Gaussian over the features (across data points) of the latent X; however, this causes a shape issue. In fact, I also want the posterior in a similar shape. Can you help sort it out? -J

import torch
import numpy as np 
from torch.nn import Parameter
import matplotlib.pyplot as plt  
import warnings
warnings.filterwarnings("ignore")
import pyro
import pyro.contrib.gp as gp
import pyro.ops.stats as stats
import pyro.distributions as dist
pyro.enable_validation(True)       # can help with debugging
np.random.seed(0)
torch.manual_seed(0)
pyro.set_rng_seed(0)

Y = torch.randn(2708, 1043)    
X_prior_mean =  torch.randn(2708, 32)  
 
pyro.clear_param_store()

latentDim = X_prior_mean.shape[1]
 
kernel = gp.kernels.RBF(input_dim=latentDim,
                        lengthscale=torch.ones(latentDim))
 
X_prior_mean =  X_prior_mean.to(torch.float64) 
X = Parameter(X_prior_mean.clone())

Xu = stats.resample(X_prior_mean.clone(), 35)

gplvm = gp.models.SparseGPRegression(X, Y.T, 
            kernel, Xu, noise=torch.tensor(0.0001), jitter=1e-5)

gplvm.X = pyro.nn.PyroSample(dist.MultivariateNormal(
    X_prior_mean.new_zeros(X_prior_mean.shape[1], X_prior_mean.shape[0]),
    scale_tril=torch.eye(2708).to(torch.float64)).to_event())
# This makes X of shape [32, 2708], while it should be [2708, 32];
# how can I reshape this?
 
#gplvm.X = pyro.nn.PyroSample(dist.Normal(X_prior_mean, 0.1).to_event())   # This line works

#gplvm.autoguide("X", dist.Normal)     
gplvm.autoguide("X", dist.MultivariateNormal)

gplvm.mode = "guide"
print(gplvm.X_loc)    
print(gplvm.X_scale_tril_unconstrained)  # All zeros 

#note that training is expected to take a minute or so
gplvm.mode = "model"
#gplvm.set_data(X, Y.T)
losses = gp.util.train(gplvm, num_steps=10)

gplvm.mode = "guide"
print(gplvm.X_loc)  #  This does not change
print(gplvm.X_scale_tril_unconstrained) # All zeros no change
plt.figure()
plt.plot(losses)
plt.show()

#gplvm.mode = "guide"
#X = gplvm.X  # draw a sample from the guide of the variable X
X = gplvm.X_loc.detach().numpy().T     
plt.figure()
plt.scatter(X[:,0],X[:,1])
# This visualization may not be informative, because it shows only
# 2 of the latentDim (= 32) dimensions
plt.show()

You can change X to X.t() by creating an attribute gplvm.X_t = PyroSample(...) and then training the following model/guide pair:

def model(...):
    gplvm.set_data(gplvm.X_t.t(), gplvm.y)
    gplvm.model(...)

def guide(...):
    gplvm.set_data(gplvm.X_t.t(), gplvm.y)
    gplvm.guide(...)

You can also just use pyro.sample directly, which is more flexible given that you want a structured covariance matrix:

def model(...):
    X_t = pyro.sample("X_t", dist.Normal(...))
    ...

def guide(...):
    loc = pyro.param(...)
    custom_cov = ...
    X_t = pyro.sample("X_t", dist.MultivariateNormal(loc, custom_cov))
    ...
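For concreteness, a minimal runnable sketch of the second suggestion. The names X_t_loc and X_t_scales and the learned-diagonal scale_tril are my assumptions, not the thread's code; any valid scale_tril (e.g. built from a graph Laplacian) could replace the stand-in.

def model():
    gplvm.mode = "model"
    # prior over the transposed latents X_t, shape [latentDim, N]
    X_t = pyro.sample("X_t", dist.Normal(X_prior_mean.t(), 1.0).to_event())
    gplvm.set_data(X_t.t(), gplvm.y)   # gplvm expects X of shape [N, latentDim]
    gplvm.model()

def guide():
    gplvm.mode = "guide"
    N, D = X_prior_mean.shape
    loc = pyro.param("X_t_loc", X_prior_mean.t().clone())
    # stand-in for a structured covariance over the N data points
    scales = pyro.param("X_t_scales",
                        0.1 * torch.ones(D, N, dtype=torch.float64),
                        constraint=dist.constraints.positive)
    X_t = pyro.sample("X_t", dist.MultivariateNormal(
        loc, scale_tril=torch.diag_embed(scales)).to_event())
    gplvm.set_data(X_t.t(), gplvm.y)
    gplvm.guide()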

P.S. Could you reformat your post to make it easier to follow? You can mimic how I formatted the code in your previous comment.

I am not sure where to put these def model(...) and def guide(...). Can you give an example or make small changes to my code?

The model, guide pair is used to do SVI inference. You can read the SVI tutorial on how to do SVI in Pyro, and this forum topic on how to use it to do mini-batch GPLVM.
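For reference, the generic SVI pattern for such a pair (a sketch, assuming the model and guide functions above):

from pyro.infer import SVI, Trace_ELBO
from pyro.optim import Adam

# Hedged sketch of the standard SVI loop for a custom model/guide pair.
svi = SVI(model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())
losses = []
for step in range(1000):
    losses.append(svi.step())   # one gradient step on the ELBO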

Thanks, Du. No clue yet; I may have to give up. Is there a fully complete example anywhere?

The above forum topic has a complete example of using a model, guide pair to do mini-batch GPLVM. I think there are also other forum topics where model and guide are used. Also, this notebook does SVI inference over a model, guide pair (an AutoGuide) for a GPLVM model.

Hi Du,

My example program started from your example notebook. Could you adapt your example to use a structured model and guide? That would be very helpful.

J.

I followed the GPLVM class to define a new one; see below. Do you think it is right?

# Define a different prior
from pyro.contrib.gp.parameterized import Parameterized
from pyro.nn.module import PyroSample, pyro_method

class myGPLVM(Parameterized):
    def __init__(self, base_model, L):
        super(myGPLVM, self).__init__()
        if base_model.X.dim() != 2:
            raise ValueError("GPLVM model only works with 2D latent X, but got "
                             "X.dim() = {}.".format(base_model.X.dim()))
        self.base_model = base_model
        self.X = PyroSample(dist.MultivariateNormal(
            base_model.X.new_zeros(base_model.X.shape[1], base_model.X.shape[0]),
            precision_matrix=L).to_event())
        self.autoguide("X", dist.MultivariateNormal)
        self.X_loc.data = base_model.X.t()

    @pyro_method
    def model(self):
        self.mode = "model"
        # X sampled from the prior is put into base_model
        self.base_model.set_data(self.X.t(), self.base_model.y)
        self.base_model.model() 
        self.X_loc.data = self.base_model.X.t()
        
    @pyro_method
    def guide(self):
        self.mode = "guide"
        # X sampled from the guide is put into base_model
        self.base_model.set_data(self.X.t(), self.base_model.y)
        self.base_model.guide()

    def forward(self, **kwargs):
        self.mode = "guide"
        self.base_model.set_data(self.X.t(), self.base_model.y)
        return self.base_model(**kwargs)
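For completeness, a hedged usage sketch for this class (not from the thread): L is a stand-in identity precision matrix, and the training loop mirrors the loss_fn/optimiser discussion further down.

from pyro.infer import TraceMeanField_ELBO

# Hedged sketch: wrap a plain SparseGPRegression whose X is still a tensor.
L = torch.eye(2708, dtype=torch.float64)   # placeholder; e.g. a graph Laplacian
mygplvm = myGPLVM(gplvm, L)

loss_fn = TraceMeanField_ELBO().differentiable_loss
optimiser = torch.optim.Adam(mygplvm.parameters(), lr=0.01)
for step in range(10):
    optimiser.zero_grad()
    loss = loss_fn(mygplvm.model, mygplvm.guide)
    loss.backward()
    optimiser.step()   # easy to forget, as the exchange below shows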

Since it is not possible to step into PyroSample() with a debugger, I am not very sure what exactly happens inside this call. It seems to me that after

self.X = PyroSample(dist.MultivariateNormal(
    base_model.X.new_zeros(base_model.X.shape[1], base_model.X.shape[0]),
    precision_matrix=L).to_event())

there is no attribute X on self.

-J

@junbin.gao Your solution looks much better than my suggestion. (It seems that the statement self.X_loc.data = self.base_model.X.t() is not needed in def model(self)?)

Could you adapt your example to use a structured model and guide?

It is defined in Cell 11 of that notebook.

there is no attribute X on self

I am not sure. You can call print(self.X) in your self.model and self.guide to inspect its value. Basically, X = self.X is just a shortcut for

X = pyro.sample(some_name, the_prior_defined_in_PyroSample)
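(A tiny illustration of this shortcut, my own rather than the thread's:)

from pyro.contrib.gp.parameterized import Parameterized
from pyro.nn.module import PyroSample, pyro_method

# Hedged illustration: accessing a PyroSample attribute inside model()
# triggers a fresh pyro.sample statement; X never sits on self as a tensor.
class Tiny(Parameterized):
    def __init__(self):
        super().__init__()
        self.X = PyroSample(dist.Normal(0.0, 1.0))

    @pyro_method
    def model(self):
        self.mode = "model"
        return self.X   # same as pyro.sample("X", dist.Normal(0.0, 1.0))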

Unfortunately, if I remove self.X_loc.data = self.base_model.X.t(), then X_loc does not change anymore. How do I make X a trainable variable?

Hmm, it is tricky. I don’t know why…

Hahaha, found the error: I missed optimiser.step(). Thanks!

But I ran into trouble with

loss = loss_fn(gplvm.model, gplvm.guide) + regularizer*reg_over(gplvm.X_loc)

which produces the following error:

RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.

How should I add this regularizer reg_over(gplvm.X_loc) to the loss?

Here you are: pyro.factor

Thanks, Du. Where should I add this pyro.factor? In the model definition? Should I take the log of this factor myself, or will Pyro handle the log?

I think I am slowly getting to understand Pyro.

J.

I think the best way to find an answer is to search for it, either in the documentation or in the forum. You can find out how to use pyro.factor here, or from the documentation link in my last comment.
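For illustration, a hedged sketch of how the penalty could enter through the model method of myGPLVM above (regularizer and reg_over are the names from the earlier message; penalizing the guide parameter X_loc from inside the model is a pragmatic choice, not the only one). Note that pyro.factor takes the log-scale term directly, so no extra log is needed.

    @pyro_method
    def model(self):
        self.mode = "model"
        self.base_model.set_data(self.X.t(), self.base_model.y)
        self.base_model.model()
        # pyro.factor adds its argument to the model log-density, so the
        # ELBO loss gains +regularizer * reg_over(self.X_loc); the argument
        # is already the log-factor -- Pyro does not take the log for you.
        pyro.factor("X_loc_penalty", -regularizer * reg_over(self.X_loc))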