The tutorial says that we need to define an optimizer for an SVI instance, specifying the learning rate etc., like:
```python
from pyro.infer import SVI, Trace_ELBO
from pyro.optim import Adam

# set up the optimizer
adam_params = {"lr": 0.0005, "betas": (0.90, 0.999)}
optimizer = Adam(adam_params)

# set up the inference algorithm
svi = SVI(model, guide, optimizer, loss=Trace_ELBO())

n_steps = 5000
# do gradient steps
for step in range(n_steps):
    svi.step(data)
```
The loss (the ELBO loss) measures the difference between the variational distribution q(z) and the true posterior distribution p(z|x).
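For reference, the standard identity that makes "difference" precise here (using the same q(z), p(z|x) notation as above) is:

```latex
\mathrm{ELBO}(q) = \mathbb{E}_{q(z)}\bigl[\log p(x, z) - \log q(z)\bigr]

\log p(x) = \mathrm{ELBO}(q) + \mathrm{KL}\bigl(q(z)\,\|\,p(z \mid x)\bigr)
```

Since log p(x) does not depend on q, maximizing the ELBO is equivalent to minimizing the KL divergence between q(z) and the true posterior p(z|x); Trace_ELBO reports the negative ELBO so that SVI can minimize it with a standard optimizer.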
However, the optimizer for a deterministic neural network minimizes a loss function (e.g., cross-entropy on the softmax output) that measures the difference between the network's predictions and the target labels.
My question is: if I want to add SVI capability to the deterministic neural network that I already have, do I need only the SVI optimizer, or both? Thanks.