Hi there! I am new to Pyro and I am trying to implement Gaussian Process Dynamical Models (GPDMs) (see http://www.dgp.toronto.edu/~jmwang/gpdm/nips05final.pdf for reference). Basically, a GPDM is a GPLVM with additional GPs governing the dynamics of the latent states. Given a sequence of N observations [y_1, …, y_N] (each y_i of dimension D), I want to find the associated latent sequence [x_1, …, x_N] (each x_i of dimension d << D). The latent mapping x -> y and the dynamics x(t) -> x(t+1) are assumed to be governed by distinct GPs.
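In symbols, my reading of the model is the following (n_{x,t} and n_{y,t} are Gaussian noise terms; notation loosely follows the paper):

$$
\begin{aligned}
x_t &= f(x_{t-1}) + n_{x,t}, &\quad f &\sim \mathcal{GP}\!\left(0, k_X\right),\\
y_t &= g(x_t) + n_{y,t}, &\quad g &\sim \mathcal{GP}\!\left(0, k_Y\right).
\end{aligned}
$$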
Following the GPLVM tutorial, I came up with this basic implementation (Y is the N x D observation matrix and X_prior is the PCA initialization of the latent variables):
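For completeness, X_prior is computed along these lines (a sketch; `torch.pca_lowrank` is just one convenient way to do PCA, and the toy sizes N, D, d here are placeholders):

```python
import torch

# Sketch of a PCA initialization for the latent matrix
# (assumption: any standard PCA implementation would do).
N, D, d = 100, 10, 3
Y = torch.randn(N, D)                # N x D observations (toy data here)
U, S, V = torch.pca_lowrank(Y, q=d)  # centers Y internally by default
X_prior = U * S                      # N x d principal-component scores
print(X_prior.shape)  # torch.Size([100, 3])
```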
```python
import torch
import pyro
import pyro.contrib.gp as gp
from torch.nn import Parameter

# kernel of the latent map GP
kernelY = gp.kernels.RBF(input_dim=d, lengthscale=torch.ones(d))
# kernel of the dynamics GP
kernelX = gp.kernels.RBF(input_dim=d, lengthscale=torch.ones(d))
# initialize the N x d latent matrix as a parameter
X = Parameter(X_prior.clone())
# latent map GP X -> Y
gpy = gp.models.GPRegression(X, Y.t(), kernelY, noise=torch.tensor(0.01), jitter=1e-5)
# dynamics GP X(1:N-1) -> X(2:N)
gpx = gp.models.GPRegression(X[:-1, :], X[1:, :].t(), kernelX, noise=torch.tensor(0.01), jitter=1e-5)

optimizer = torch.optim.Adam(list(gpy.parameters()) + list(gpx.parameters()), lr=0.01)
loss_fn = pyro.infer.Trace_ELBO().differentiable_loss
num_steps = 2500
for i in range(num_steps):
    optimizer.zero_grad()
    loss = loss_fn(gpy.model, gpy.guide) + loss_fn(gpx.model, gpx.guide)
    loss.backward()
    optimizer.step()
```
The code works fine: I can fit the models and recover a reasonable latent evolution. But if, instead of slicing `X[1:, :]`, I use index lists like
```python
in_indices = list(range(0, N - 1))
out_indices = list(range(1, N))
gpx = gp.models.GPRegression(X[in_indices, :], X[out_indices, :].t(), kernelX, noise=torch.tensor(0.01), jitter=1e-5)
```
I get the error:

```
RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.
```
and if I use `loss.backward(retain_graph=True)`, the optimization no longer converges to a good solution. I also tried `pyro.ops.indexing`, but the problem persists.
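For what it's worth, the difference seems reproducible in plain PyTorch, outside Pyro. A minimal sketch of my understanding (an assumption on my part): basic slicing stores only shape metadata for backward, while advanced indexing saves the index tensors, which are freed after the first backward pass through the graph.

```python
import torch

X = torch.nn.Parameter(torch.randn(5, 2))

# Basic slicing: the backward node needs no saved tensors,
# so the same graph node can be traversed repeatedly.
sliced = X[:-1, :]
sliced.sum().backward()
sliced.sum().backward()  # no error

# Advanced indexing: the backward node saves the index tensors,
# which are freed after the first backward call.
indexed = X[[0, 1, 2, 3], :]
indexed.sum().backward()
try:
    indexed.sum().backward()
    failed = False
except RuntimeError:
    failed = True
print(failed)
```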
Does anyone know why slicing and indexing behave differently here, and how I should properly index `X` in this context?
Also, if anyone has implemented a GPDM in Pyro before, feel free to share any kind of suggestion.
P.S. I need to index `X` because I model two different observation sequences, of lengths N1 and N2, and I must ensure that the first latent state of the 2nd sequence does not depend on the last latent state of the 1st:

```python
in_indices = list(range(0, N1 - 1)) + list(range(N1, N1 + N2 - 1))
out_indices = list(range(1, N1)) + list(range(N1 + 1, N1 + N2))
```
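As a sanity check (a toy sketch with hypothetical lengths N1 = N2 = 3), pairing the two lists shows that consecutive states are linked within each sequence while the boundary transition N1-1 -> N1 is skipped:

```python
# Toy lengths, chosen only to make the index pairs easy to inspect.
N1, N2 = 3, 3
in_indices = list(range(0, N1 - 1)) + list(range(N1, N1 + N2 - 1))
out_indices = list(range(1, N1)) + list(range(N1 + 1, N1 + N2))
pairs = list(zip(in_indices, out_indices))
print(pairs)  # [(0, 1), (1, 2), (3, 4), (4, 5)] -- no (2, 3) cross-sequence pair
```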