Hello,

I’m using pyro.contrib.gp to fit a GP regression model to my data, but I have run into a weird issue.

Say we define 2 GPR models as follows:

```
import torch
import pyro
import pyro.contrib.gp as gp

# Two identical models: one kept at its initial lengthscale, one to be optimized
kernel_init = gp.kernels.RBF(input_dim=dimension, variance=torch.tensor(1.), lengthscale=length_scale_init)
gpr_init = gp.models.GPRegression(train_x, train_y, kernel_init, noise=torch.tensor(0.), jitter=jitter)
kernel = gp.kernels.RBF(input_dim=dimension, variance=torch.tensor(1.), lengthscale=length_scale_init)
gpr_opt = gp.models.GPRegression(train_x, train_y, kernel, noise=torch.tensor(0.), jitter=jitter)
```
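
For context, `dimension`, `train_x`, `train_y`, `length_scale_init`, `jitter`, and `num_steps` come from my setup; a placeholder version would be something like (my real data differs):

```
# Placeholder setup -- stands in for my real data
dimension = 2
train_x = torch.rand(50, dimension)
train_y = torch.sin(train_x.sum(dim=1))
length_scale_init = torch.ones(dimension)  # one lengthscale per input dimension
jitter = 1e-6
num_steps = 2000
```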

Then one optimizes the lengthscale parameters of the second model’s kernel (I use the standard `Trace_ELBO` loss from the Pyro GP tutorial, i.e. the negative marginal log likelihood):

```
optimizer = torch.optim.Adam([{'params': gpr_opt.kernel.lengthscale_unconstrained}], lr=5e-4)
loss_fn = pyro.infer.Trace_ELBO().differentiable_loss
losses = []
for i in range(num_steps):
    # Zero gradients from previous iteration
    optimizer.zero_grad()
    # Calc loss and backprop gradients
    loss = loss_fn(gpr_opt.model, gpr_opt.guide)
    loss.backward()
    # Update step
    optimizer.step()
    losses.append(loss.item())
```
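
After training, the two kernels do report different lengthscales:

```
print(gpr_init.kernel.lengthscale)  # unchanged initial value
print(gpr_opt.kernel.lengthscale)   # moved by the optimizer
```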

Finally, you compute the predictions of the two models at some `test_x` points:

```
with torch.no_grad():
    init_value = gpr_init(test_x, full_cov=False, noiseless=True)
    opt_value = gpr_opt(test_x, full_cov=False, noiseless=True)
```

And weirdly, one gets the same values, `opt_value == init_value`, even though each GPR model has a kernel with a different lengthscale. How come? It seems to work correctly if I remove the `init_value = ...` line under `with torch.no_grad():`.
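
Concretely, the model call returns a `(mean, variance)` pair, and the check that surprises me is:

```
mean_init, var_init = init_value
mean_opt, var_opt = opt_value
print(torch.allclose(mean_init, mean_opt))  # True, unexpectedly
print(torch.allclose(var_init, var_opt))    # also True
```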