How to make predictions with a basic GP?

Hello, I followed the instructions for the basic Gaussian process regression example on the docs website, and I get predictions on new data that are all the same regardless of the input. There is no clear example on the website of how to make predictions with new data, so I am not sure whether my error is related to that. Here is my full code:

import pyro  # type: ignore
import pyro.contrib.gp as gp  # type: ignore
import torch
from torch.utils.data import DataLoader
from tqdm import tqdm, trange  # type: ignore

from .dataloader import GPLoader

device = torch.device("cuda:1")

def train() -> None:
    dataset = GPLoader(["2017"], device)

    kernel = gp.kernels.RBF(input_dim=707)
    gpr = gp.models.GPRegression(, dataset.y.squeeze(1).to(device), kernel)

    optimizer = torch.optim.Adam(gpr.parameters(), lr=5)
    loss_fn = pyro.infer.Trace_ELBO().differentiable_loss

    loss_log = tqdm(desc="Losses", leave=False, position=1)
    for i in trange(100, desc="Iter", position=0, leave=False):
        optimizer.zero_grad()
        loss = loss_fn(gpr.model, gpr.guide)
        loss.backward()
        optimizer.step()
        loss_log.set_description(f"loss: {loss.item()}")

    test_set = GPLoader(["2017"], device)
    dataloader = DataLoader(test_set, batch_size=16)

    for i, (x, y) in enumerate(tqdm(dataloader, desc="Test", leave=False, position=1)):
        mu, sigma = gpr(x)
        print(mu, sigma)

Does anyone see anything wrong in this code?

Hi @deltaskelta, the section you pointed out does prediction on new data.

"predictions on new data that are all the same regardless of the input data"

This is strange. Perhaps the optimized parameters end up with bad values? Could you please add some more information about the parameters, the loss, and so on, because I can’t run your code? Your code looks good to me, though lr=5 is quite surprising.

I wasn’t sure exactly what to post, so I added a small example repo that has an out file showing the loss output. If you want to run the code yourself you can do python

@deltaskelta I haven’t checked the code, but your output suggests that the targets y have a large scale. I think it is better to scale down your data rather than leave the values at 737500, 290000,…
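
For illustration, here is a minimal sketch of one way to scale the targets, assuming y is a 1-D torch tensor. The helper names standardize and unstandardize are hypothetical, and the sample values other than 737500 and 290000 are made up. The idea is to train the GP on the standardized targets and map its predictions back to the original units afterwards.

```python
import torch


def standardize(y):
    """Shift and scale targets to zero mean and unit standard deviation."""
    mean, std = y.mean(), y.std()
    return (y - mean) / std, mean, std


def unstandardize(pred_mean, pred_var, mean, std):
    """Map a predictive mean and variance back to the original units."""
    # The mean is shifted and scaled; the variance scales with std squared.
    return pred_mean * std + mean, pred_var * std ** 2


# Illustrative targets on the same scale as in the thread.
y = torch.tensor([737500.0, 290000.0, 455000.0, 612000.0])
y_scaled, y_mean, y_std = standardize(y)

# y_scaled is what would be passed to the GP instead of y.
# After predicting in the scaled space, convert back:
#   mu, var = unstandardize(mu_scaled, var_scaled, y_mean, y_std)
```

The same mean and std computed on the training targets should be reused for the test set, so that both live in the same scaled space.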

To get predictions from a GP model, it is enough to call gpr(x_new) as you did. It is also presented in the plot function of the GP tutorial (through plot_predictions=True).

OK, but is there any reason why I have to scale down the data? Why can a GP output a large number (it outputs somewhere near the mean of all my data) without being accurate at that scale, yet be accurate once the data is scaled down? I’m very confused.

Good question! I don’t have a clear answer for that. I have learned that I should do it and have used that rule of thumb until now. :smiley: