Inference algorithm

Hi all,

I have a question which relates or adds to this post.

My question is: what inference algorithm does Pyro use for Trace_ELBO()?

On this page, it says that the estimator uses [1] and [2]. However, [4] says that the primary inference algorithm Pyro implements is [3]. In the SVI tutorial, there is a section here that references the reparameterisation trick used in [3].

From my understanding of [3], the gradient of the ELBO is taken w.r.t. both the model parameters \theta and the variational parameters \phi, and each iteration increases the ELBO by updating both \theta and \phi. In [1] and [2], the gradient of the ELBO is taken w.r.t. the variational parameters \phi only; the gradient w.r.t. the model parameters \theta is not computed, and \theta is not adjusted in each iteration.

Under the hood, what is the inference algorithm for Trace_ELBO()? Does using Trace_ELBO() involve estimating/computing \nabla_{\theta, \phi}\text{ELBO} or \nabla_{\phi}\text{ELBO}?

References:

[1] Automated Variational Inference in Probabilistic Programming,
David Wingate, Theo Weber

[2] Black Box Variational Inference,
Rajesh Ranganath, Sean Gerrish, David M. Blei

[3] Auto-Encoding Variational Bayes,
Diederik P Kingma, Max Welling

[4] Pyro: Deep Universal Probabilistic Programming,
Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, Noah D. Goodman

This is explained in this tutorial. Basically, Pyro uses reparameterized gradients (as in [3]) when it can and falls back to score function gradients (as in [2]) when it can't. So it is always computing stochastic estimates of \nabla_{\theta, \phi}\text{ELBO}, but how it does so depends on whether the guide has discrete latent variables, etc.
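To make the difference between the two estimators concrete, here is a minimal sketch in plain Python. It is not Pyro's implementation; it is a toy problem chosen for illustration. It assumes a one-dimensional q(z) = N(\mu, 1) and the test function f(z) = z^2, so the exact gradient of E_q[f(z)] w.r.t. \mu is 2\mu. The function name `grad_estimates` and its parameters are made up for this example.

```python
import random


def grad_estimates(mu, num_samples=100_000, seed=0):
    """Estimate d/dmu E_{z ~ N(mu, 1)}[z**2] in two ways.

    The exact value is 2 * mu. This is a toy illustration, not Pyro code.
    """
    rng = random.Random(seed)
    reparam_total = 0.0
    score_total = 0.0
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)
        # Reparameterized sample: z is a deterministic function of mu and noise.
        z = mu + eps
        # Reparameterized (pathwise) estimator: d/dmu f(mu + eps) = 2 * z.
        reparam_total += 2.0 * z
        # Score function estimator: f(z) * d/dmu log N(z | mu, 1) = z**2 * (z - mu).
        score_total += z * z * (z - mu)
    return reparam_total / num_samples, score_total / num_samples


mu = 1.5
reparam, score = grad_estimates(mu)
print(f"exact: {2 * mu:.3f}")
print(f"reparameterized estimate: {reparam:.3f}")
print(f"score function estimate:  {score:.3f}")
```

Both estimators are unbiased for the same gradient, but the score function estimator typically has much higher variance, which is why the reparameterized form is preferred whenever the sample can be written as a differentiable function of the parameters.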

sounds clear now. thanks @martinjankowiak