I have a GP that I am fitting in Pyro, where I set the lengthscale to a large constant based on prior knowledge:
```python
amp = pyro.sample('amp', Gamma(torch.DoubleTensor([2.]).to(device),
                               torch.DoubleTensor([0.5]).to(device)))
K = gp.kernels.RBF(input_dim=1, variance=amp,
                   lengthscale=torch.tensor(100.).to(device))
cov_beta = K(torch.DoubleTensor(days).to(device))
cov_beta.view(-1)[::D + 1] += jitter
beta = pyro.sample('beta', MultivariateNormal(torch.ones(D).to(device),
                                              covariance_matrix=cov_beta))
```
However, when I fit this model, a plot of beta looks as follows:
Clearly the effective lengthscale here is much smaller than 100. Does SVI optimize the lengthscale even though I specified it as a fixed value, with no prior? It seems like I am doing something wrong here.
I even tried putting a uniform prior on the lengthscale with a lower bound of 100, and I get a similar result:
```python
ls = pyro.sample('ls', Uniform(torch.DoubleTensor([100.]).to(device),
                               torch.DoubleTensor().to(device)))
```
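As a workaround, I'm considering computing the RBF covariance by hand in plain torch, so the lengthscale never touches Pyro's kernel parameters at all. This is a simplified sketch, not my actual model code; `rbf_cov` is just a helper I wrote for this post:

```python
import torch

def rbf_cov(x, variance, lengthscale, jitter=1e-6):
    # Pairwise differences between 1-D inputs: shape (N, N).
    diff = x.unsqueeze(-1) - x.unsqueeze(-2)
    # Standard RBF kernel: variance * exp(-0.5 * (d / lengthscale)^2).
    K = variance * torch.exp(-0.5 * (diff / lengthscale) ** 2)
    # Jitter on the diagonal for numerical stability.
    return K + jitter * torch.eye(x.shape[0], dtype=x.dtype)

days = torch.arange(10, dtype=torch.double)
cov = rbf_cov(days,
              variance=torch.tensor(2.0, dtype=torch.double),
              lengthscale=torch.tensor(100.0, dtype=torch.double))
print(cov[0, 0].item(), cov[0, -1].item())
```

With a lengthscale of 100 over a 10-day input range, every entry of this covariance stays close to the variance, which is the near-constant beta I was expecting from the model above.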