If I understand correctly, Pyro implements SVI with the ELBO loss in the following way:
- Decreasing the estimate of the expected log q(z) with respect to the variational parameters (phi), which are declared via `pyro.param` statements in the guide function - this term of the loss can be obtained from the guide trace's `log_prob_sum()`
- Increasing the estimate of the expected log p(x|z) plus log p(z) with respect to the model parameters (theta) - this term corresponds to the model trace's `log_prob_sum()`
My question is about the theta parameters. As expected, there are none of them in
`pyro.get_param_store()` when nothing was explicitly declared in the model function using
`pyro.param`. But even in this case,
`model_tr.log_prob_sum()` increases during training - which 'degrees of freedom' are used for that? What is actually being optimized?
Thanks a lot!