I am working on the Gaussian Process Tutorial http://pyro.ai/examples/gp.html

I was able to reproduce the exact result on the first dataset of the tutorial (N=20) using GPRegression, using the given code.

```
import torch
import pyro
import pyro.contrib.gp as gp
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO
from pyro.optim import Adam
smoke_test = ('CI' in os.environ) # ignore; used for checking code integrity in the Pyro repo
pyro.enable_validation(True) # can help with debugging
pyro.set_rng_seed(0)
N = 20
X = dist.Uniform(0.0, 5.0).sample(sample_shape=(N,))
y = 0.5 * torch.sin(3*X) + dist.Normal(0.0, 0.2).sample(sample_shape=(N,))
kernel = gp.kernels.RBF(input_dim=1)
gpr = gp.models.GPRegression(X, y, kernel, noise=torch.tensor(1.))
optim = Adam({"lr": 0.005})
svi = SVI(gpr.model, gpr.guide, optim, loss=Trace_ELBO())
losses = []
num_steps = 2500 if not smoke_test else 2
for i in range(num_steps):
losses.append(svi.step())
```

However, when I try to do inference on the same data with Variational Inference, using the following code,

```
kernel = gp.kernels.RBF(input_dim=1)
likelihood = gp.likelihoods.Gaussian()
vgp = gp.models.VariationalGP(X, y, kernel, likelihood=likelihood, whiten=True)
pyro.clear_param_store()
num_steps = 2500 if not smoke_test else 2
losses = vgp.optimize(optimizer=Adam({"lr": 0.01}), num_steps=num_steps)
```

It almost always breaks because cholesky decomposition can not be performed due to numerical issues,

```
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/fengc/anaconda2/envs/pyro/lib/python3.6/site-packages/pyro/contrib/gp/models/model.py", line 199, in optimize
losses.append(svi.step())
File "/home/fengc/anaconda2/envs/pyro/lib/python3.6/site-packages/pyro/infer/svi.py", line 75, in step
loss = self.loss_and_grads(self.model, self.guide, *args, **kwargs)
File "/home/fengc/anaconda2/envs/pyro/lib/python3.6/site-packages/pyro/infer/trace_elbo.py", line 104, in loss_and_grads
for model_trace, guide_trace in self._get_traces(model, guide, *args, **kwargs):
File "/home/fengc/anaconda2/envs/pyro/lib/python3.6/site-packages/pyro/infer/trace_elbo.py", line 52, in _get_traces
model_trace = poutine.trace(poutine.replay(model, trace=guide_trace)).get_trace(*args, **kwargs)
File "/home/fengc/anaconda2/envs/pyro/lib/python3.6/site-packages/pyro/poutine/trace_messenger.py", line 198, in get_trace
self(*args, **kwargs)
File "/home/fengc/anaconda2/envs/pyro/lib/python3.6/site-packages/pyro/poutine/trace_messenger.py", line 182, in __call__
ret = self.fn(*args, **kwargs)
File "/home/fengc/anaconda2/envs/pyro/lib/python3.6/site-packages/pyro/poutine/messenger.py", line 148, in __call__
return self.fn(*args, **kwargs)
File "/home/fengc/anaconda2/envs/pyro/lib/python3.6/site-packages/pyro/contrib/gp/models/vsgp.py", line 117, in model
Luu = Kuu.potrf(upper=False)
RuntimeError: Lapack Error in potrf : the leading minor of order 14 is not positive definite at /pytorch/aten/src/TH/generic/THTensorLapack.c:617
```

I also tried add variance to the Gaussian likelihood and using Rprop instead of Adam for SGD, and change jitter to 1e-5 or 1e-4. But it does not help, the SGD still stops at certain point of the iteration.

I am wondering what is the best practice of doing Variational inference for Gaussian process.