Alternate form of the ELBO allows easier batching, I think?

Trace_ELBO (link) uses the “entropy form” of the ELBO (following the equations in Blei et al. 2017, p. 862):

ELBO(q) = E[log p(z, x)] - E[log q(z)]

But you can alternatively use the “likelihood-minus-KL form”:

ELBO(q) = E[log p(x | z)] - KL( q(z) || p(z) )
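These two forms are just algebraic rearrangements of each other, since log p(z, x) = log p(x | z) + log p(z). A quick numerical sanity check in plain NumPy (the Gaussian toy model and all of the numbers here are made up for illustration; the KL in the second form is estimated by Monte Carlo from the same samples):

```python
import numpy as np

def log_normal(x, mu, sigma):
    # log density of N(mu, sigma^2), elementwise
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2

rng = np.random.default_rng(0)

# toy model: z ~ N(0, 1) (prior), x | z ~ N(z, 1); one observed x
x_obs = 1.3
mu_q, sigma_q = 0.5, 0.8                     # an arbitrary Gaussian surrogate q(z)
z = rng.normal(mu_q, sigma_q, size=100_000)  # samples from q

log_prior = log_normal(z, 0.0, 1.0)
log_lik = log_normal(x_obs, z, 1.0)
log_q = log_normal(z, mu_q, sigma_q)

# “entropy form”: E[log p(z, x)] - E[log q(z)]
elbo_entropy = np.mean(log_prior + log_lik) - np.mean(log_q)

# “likelihood minus KL form”: E[log p(x | z)] - KL(q(z) || p(z)),
# with the KL estimated by Monte Carlo using the same samples
elbo_lik_kl = np.mean(log_lik) - np.mean(log_q - log_prior)

assert np.isclose(elbo_entropy, elbo_lik_kl)
```

With shared samples the two estimates agree exactly (up to floating-point summation order), because each per-sample term is the same quantity rearranged.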

If you use the second form, batching becomes a lot easier; see the code I wrote here for one implementation.

For instance, you don’t have to batch via plates. You can take any model, compute the mini-batch likelihood, and compute the KL between the surrogate posterior and the prior.
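Here is a rough sketch of what I mean, in plain NumPy rather than NumPyro (the toy model, the Gaussian surrogate, and all sizes are invented for illustration): for a single global latent whose likelihood factorizes over data points, scale the mini-batch log-likelihood by N/B and subtract an analytic KL. Averaged over a disjoint partition of the data, the scaled batch estimates recover the full-data estimate:

```python
import numpy as np

def log_normal(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2

def kl_normal(mu_q, sigma_q, mu_p, sigma_p):
    # analytic KL(N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2))
    return (np.log(sigma_p / sigma_q)
            + (sigma_q**2 + (mu_q - mu_p) ** 2) / (2 * sigma_p**2) - 0.5)

rng = np.random.default_rng(0)
N, B = 100, 10
data = rng.normal(1.0, 1.0, size=N)

mu_q, sigma_q = 0.7, 0.9       # Gaussian surrogate posterior q(z)
z = rng.normal(mu_q, sigma_q)  # one sample from q (single-sample ELBO estimate)

def elbo_batch(batch):
    # likelihood-minus-KL form on a mini-batch: scale the batch
    # log-likelihood up by N / batch_size; the KL involves only the global z
    scale = N / len(batch)
    return (scale * np.sum(log_normal(batch, z, 1.0))
            - kl_normal(mu_q, sigma_q, 0.0, 1.0))

full = elbo_batch(data)                 # “batch” = the whole data set
batches = data.reshape(N // B, B)       # a disjoint partition into batches
avg = np.mean([elbo_batch(b) for b in batches])

# averaged over the partition, the scaled batch estimates reproduce
# the full-data ELBO estimate exactly
assert np.isclose(full, avg)
```

No plate statements anywhere; the only batching-specific piece is the N/B scale on the likelihood term.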

This seems to me to be a major simplification. Is there a reason why NumPyro doesn’t do this out of the box?

i’m afraid i don’t follow. i don’t see why it would matter for mini-batching, plates, etc. also recall that num/pyro both need to support general plate structure, not just the simplest, most vanilla scenario

As I understand it, the current strategy for batching in VI with NumPyro is to use plates. Please correct me if I’m wrong there and am missing how to batch models without plates.

This means that you need to write your model differently to batch it, which is friction on the user end.

Without plates, I am not sure how to compute E[log p(z, x)] per batch (perhaps there is a way, but I don’t know it).

But it’s easy to compute the second form of the ELBO in this case. So you’ve removed the need for plates when batching, which allows easier model building.

not really sure what model structure you’re referring to. if you have e.g. a local latent variable model like a VAE, you don’t need plates. it’s really when you have global latent variables and then want to sub-sample local latent variables and/or data that you need plates
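to make that concrete, here’s roughly what the local-latent case looks like, sketched in plain numpy (the stand-in “encoder” and all the numbers are made up; this is not VAE or numpyro code): each batch element gets its own local latent, so every term is a complete per-datapoint ELBO, and the batch mean needs no N/B rescaling of a shared likelihood

```python
import numpy as np

def log_normal(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2

def kl_std_normal(mu, sigma):
    # analytic KL(N(mu, sigma^2) || N(0, 1)), elementwise
    return 0.5 * (sigma**2 + mu**2 - 1.0) - np.log(sigma)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=8)   # a mini-batch of data

# stand-in “encoder”: per-datapoint variational parameters (purely illustrative)
mu_q = 0.5 * x
sigma_q = np.full_like(x, 0.9)

z = rng.normal(mu_q, sigma_q)      # one local latent per datapoint

# per-datapoint ELBO: reconstruction term minus per-datapoint KL.
# no global latent couples the batch elements, so the batch mean is
# already an unbiased estimate of the average per-datapoint ELBO
elbo_per_point = log_normal(x, z, 1.0) - kl_std_normal(mu_q, sigma_q)
batch_elbo = elbo_per_point.mean()
```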

I feel like I’m missing something here. Could you point me to an example of batching with VI that does not need plates?

Ah, OK, I read and thought about this some more, and now I understand: if you have something like a nested plate, a single scaling factor N/M will be wrong, because the correct factor actually varies per site. I’ll put a note on my implementation for users. That’s very helpful.
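To spell out what I’ll note for users: with nested subsampled plates, the correct scale for a site inside both plates is the product of each plate’s full-size/subsample-size ratio, so no single flat factor can be right for every site at once. (All the plate sizes below are invented.)

```python
# hypothetical nested-plate sizes: an outer plate over groups,
# an inner plate over items within each group
N_outer, B_outer = 100, 10   # full size / subsample size of the outer plate
N_inner, B_inner = 50, 5     # full size / subsample size of the inner plate

scale_outer_site = N_outer / B_outer                           # site in the outer plate only
scale_nested_site = (N_outer / B_outer) * (N_inner / B_inner)  # site in both plates

# a single flat factor (total points / batched points) matches the
# innermost sites but not the outer-plate-only sites, so one global
# N/M cannot serve every site
flat = (N_outer * N_inner) / (B_outer * B_inner)
assert scale_nested_site == flat   # agrees for the innermost site
assert scale_outer_site != flat    # wrong for a site in the outer plate alone
```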