During the optimization step for GPRegression models, the jitter value does not appear to be used anywhere.
I'm trying to use the model for Bayesian optimization, and the Cholesky decomposition keeps failing during SVI due to small negative values in the kernel matrix. I kept increasing the jitter, but it did not help. Eventually, when I looked at the source, I found that the jitter value was not used in the
I was wondering whether this is intended and, if so, whether I'm setting some parameters wrong. For now, I have created a new kernel that adds jitter to the kernel matrix to work around the issue:
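K = super(SafeMatern52, self).forward(X, Z, diag)
if not diag and Z is None:
    # Add jitter to the diagonal of the full training covariance matrix only.
    # torch.eye needs an integer size, so use K.shape[0] rather than K.shape.
    K = K + self.eps * torch.eye(K.shape[0], dtype=K.dtype, device=K.device)
return K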
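
To illustrate why adding jitter to the diagonal helps, here is a minimal, framework-free sketch in plain Python. It is not Pyro's or PyTorch's implementation: the `cholesky` and `add_jitter` helpers, the 2x2 matrix, and the jitter value `1e-6` are all hypothetical and chosen only for illustration. The matrix mimics a kernel matrix built from two nearly identical inputs, where rounding makes it slightly indefinite.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L L^T = a, or raise ValueError.

    Hypothetical textbook implementation, for illustration only.
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                # A non-positive pivot means the matrix is not positive definite.
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def add_jitter(a, eps):
    """Return a copy of the square matrix `a` with `eps` added to its diagonal."""
    return [
        [v + (eps if i == j else 0.0) for j, v in enumerate(row)]
        for i, row in enumerate(a)
    ]


# Two almost identical inputs give an almost singular kernel matrix;
# the tiny deficit on the diagonal stands in for rounding error.
K = [[1.0, 1.0], [1.0, 1.0 - 1e-12]]

try:
    cholesky(K)
    failed_without_jitter = False
except ValueError:
    failed_without_jitter = True

# With jitter on the diagonal, the factorization goes through.
L = cholesky(add_jitter(K, 1e-6))
```

In this sketch the jitter is applied only to the square training covariance, matching the `Z is None` and `not diag` condition in the workaround above. A larger jitter makes the factorization more robust but also perturbs the model more, so the value usually needs tuning.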