Hi there, this question is not really related to Pyro, but is a general question about the ELBO. I am experimenting with a variational autoencoder whose output likelihood is Poisson/NB/multinomial for gene expression. I noticed that the reconstruction loss dominates the KL divergence, since my latent space has a small fixed dimension while my reconstruction is high-dimensional. I know that things such as Beta-VAE exist, but I am wondering if there is any other theory to support how to normalize the reconstruction term?

not sure what kind of theory you’re looking for but you might take a look at e.g. this paper and the references it cites, or the papers that cite it on google scholar

Thanks for the link. Actually, I have found that the KL has nearly no influence at all in my case, yet I get pretty good results in my latent variables. Do you know what might cause this? I have not seen any papers describing this kind of phenomenon. It makes me wonder if the KL is actually needed, since the reconstruction always seems to dominate when the dimensions are disproportionate. I imagine this happens quite a lot, especially in biological fields such as single-cell analysis.

yes the KL regularization term will have relatively little effect when you have high-dimensional data that more-or-less unambiguously pins down the corresponding latent embedding.
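To make the scaling concrete, here is a minimal sketch in plain NumPy (not Pyro). It assumes a Poisson likelihood evaluated at the true rates and a diagonal-Gaussian posterior against a standard-normal prior. The helper names `poisson_nll` and `gaussian_kl`, the gamma-distributed rates, and all sizes are hypothetical choices for illustration, not anything from the thread. The reconstruction term is a sum over genes, so it grows with the number of genes, while the KL is a sum over a fixed number of latent dimensions.

```python
import math
import numpy as np

def poisson_nll(counts, rate):
    """Negative Poisson log-likelihood, summed over the last (gene) axis."""
    log_fact = np.vectorize(math.lgamma)(counts + 1.0)  # log(counts!)
    return -(counts * np.log(rate) - rate - log_fact).sum(axis=-1)

def gaussian_kl(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over the latent axis."""
    return 0.5 * (sigma**2 + mu**2 - 1.0 - 2.0 * np.log(sigma)).sum(axis=-1)

rng = np.random.default_rng(0)
n_cells, n_latent = 64, 10  # hypothetical sizes

# A made-up posterior: random means, fixed standard deviation of 0.5.
mu = rng.normal(size=(n_cells, n_latent))
sigma = np.full((n_cells, n_latent), 0.5)
kl = gaussian_kl(mu, sigma).mean()

recon = {}
for n_genes in (100, 1000, 10000):
    # Simulated rates and counts; a real decoder would produce the rates.
    rate = rng.gamma(shape=2.0, scale=1.0, size=(n_cells, n_genes))
    counts = rng.poisson(rate)
    recon[n_genes] = poisson_nll(counts, rate).mean()
    print(f"genes={n_genes:>6}  recon NLL={recon[n_genes]:10.1f}  KL={kl:6.1f}")
```

The per-cell KL stays the same in every row because the latent dimension is fixed, while the reconstruction term keeps growing as genes are added.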

if you got rid of the KL entirely the amortized encoder (assuming you’re using one and assuming it’s trained optimally) would become a delta function that places all probability mass on the latent `z` that has the highest likelihood
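To spell out the delta-function argument, here is a short sketch. The symbols $q_\phi(z \mid x)$ for the encoder and $p_\theta(x \mid z)$ for the decoder likelihood are notation assumed here, not taken from the thread. Without the KL term the per-datapoint objective is just the expected log-likelihood, which is bounded by its maximum over `z`:

$$
\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] \;\le\; \max_{z} \log p_\theta(x \mid z) = \log p_\theta(x \mid z^\ast), \qquad z^\ast = \arg\max_{z} \log p_\theta(x \mid z).
$$

The bound is attained when $q_\phi(z \mid x) = \delta(z - z^\ast)$, so nothing in the objective rewards keeping any posterior width, assuming the maximizer exists and the encoder is flexible enough to reach it.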

Thank you. This helps a lot.