Hello,

This may sound strange, but I found that the evolution of the kernel lengthscales during GP regression training differs markedly between the newest version of Pyro (1.0.0) and previous versions (e.g. 0.4.1), for exactly the same code and the same dataset.

This is the evolution of the lengthscales for Pyro 0.4.1:

This makes perfect sense to me, since our hyperspectral dataset has different lengthscales in the spatial (dim 1, dim 2) and energy (dim 3) dimensions.

And this is for Pyro 1.0.0:

(exact same code, exact same data)

I don’t quite understand what caused such a drastic difference.

I use sparse GP regression with a Matern52 kernel and the following constraints on the lengthscale:

```
kernel.set_prior(
    "lengthscale",
    dist.Uniform(
        torch.tensor(lscale[0]),
        torch.tensor(lscale[1])
    ).independent()
)
```

where `lscale = [[1., 1., 1.], [20., 20., 20.]]` (lower bounds first, then upper bounds, one entry per dimension).
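
To make the bounds concrete, here is a minimal plain-Python sketch of how I understand such an interval constraint to act on each dimension. It assumes that the lengthscale is optimized in unconstrained space and mapped back into the interval with a sigmoid followed by an affine shift, which is how PyTorch's `transform_to(constraints.interval(...))` behaves. I have not checked that either Pyro version actually does this for the GP lengthscale, and the helper names `to_interval` and `constrained_lengthscales` are hypothetical.

```python
import math

# Bounds from the post: row 0 = lower bounds, row 1 = upper bounds,
# one column per input dimension (dim 1, dim 2 spatial; dim 3 energy).
lscale = [[1., 1., 1.], [20., 20., 20.]]


def to_interval(u, low, high):
    """Map an unconstrained value u into (low, high) via sigmoid + affine.

    Hypothetical helper; this mapping is an assumption, not Pyro's code.
    """
    return low + (high - low) / (1.0 + math.exp(-u))


def constrained_lengthscales(unconstrained):
    """Apply the per-dimension interval mapping using the bounds in lscale."""
    return [
        to_interval(u, lo, hi)
        for u, lo, hi in zip(unconstrained, lscale[0], lscale[1])
    ]


# An unconstrained value of 0 lands on the midpoint of each interval.
print(constrained_lengthscales([0.0, 0.0, 0.0]))  # [10.5, 10.5, 10.5]
```

If that assumption holds, the starting lengthscale is determined by where the unconstrained value is initialized, so a change in initialization between versions could shift the whole trajectory. That is only a guess on my part.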

Thanks in advance!