Why does ELBO loss oscillate while fitting SVI using an AutoDelta?

I get the intuition that when using an AutoDelta, with no batching, there should not be any random sampling going on, so I would expect the optimization to follow the gradient smoothly. Why do I see oscillation of the loss over the course of optimization?

Probably because of the optimization algorithm you're using. You should only expect a smooth loss curve in the limit of infinitesimally small step sizes (learning rates). As the learning rate gets larger and larger, the loss curve will get jumpier and jumpier.
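A minimal illustration of this point (plain Python, no Pyro, so the objective is a hypothetical quadratic rather than an actual ELBO): even a fully deterministic loss can oscillate under a too-aggressive update rule. Here the non-monotonicity comes from momentum repeatedly overshooting the minimum, which is analogous to what a large learning rate (or the momentum terms inside optimizers like Adam) does.

```python
# Illustrative sketch: deterministic gradient descent on f(x) = x^2.
# Zero stochasticity, yet the loss curve can still be non-monotone.
def descend(lr, momentum=0.0, steps=40, x0=1.0):
    x, v = x0, 0.0
    losses = []
    for _ in range(steps):
        losses.append(x * x)            # current loss, f(x) = x^2
        v = momentum * v - lr * 2 * x   # velocity update; f'(x) = 2x
        x += v                          # parameter update
    return losses

smooth = descend(lr=0.1)                # small step, no momentum
jumpy = descend(lr=0.1, momentum=0.9)   # momentum overshoots the minimum

# `smooth` decreases monotonically; `jumpy` repeatedly overshoots x = 0,
# so its loss rises and falls while still trending toward zero overall.
```

The same qualitative picture holds for an ELBO fit with an AutoDelta: the loss surface is deterministic, but the optimizer's trajectory across it need not be monotone.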

So using LBFGS should partially fix the problem?

It might, at least assuming there isn't some other source of stochasticity (e.g. data subsampling, a.k.a. mini-batching). However, note that LBFGS is expected to be slow if the parameter space is sufficiently high-dimensional.