Extract the KL divergence term from loss. How?

megaloman · June 20, 2019, 11:45pm

I had the same question in another thread, Extract KL loss in VAE type models from SVI?, and I think the solution I proposed there is correct, though I'm not certain. I'm using a Gaussian with diagonal covariance as my prior, so I also calculated the KL term analytically, which gives results similar to my semi-hacky solution. One minor discrepancy when using the trace: Pyro doesn't seem to calculate the KL analytically. It approximates it with samples from the reparameterized distribution (at least in the Gaussian case), so a value computed that way will vary slightly with each call to the trace. Again, I'm not 100% sure, but this seems to be the case.
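To illustrate the difference between the analytic and the sampled value, here is a minimal sketch in plain Python. It is not Pyro's code. It assumes a diagonal Gaussian posterior and a standard normal prior, and the function names (`analytic_kl`, `sampled_kl`, `log_normal`) and the example values of `mu` and `sigma` are made up for illustration.

```python
import math
import random


def analytic_kl(mu, sigma):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return 0.5 * sum(
        s * s + m * m - 1.0 - 2.0 * math.log(s) for m, s in zip(mu, sigma)
    )


def log_normal(z, mu, sigma):
    """Log density of a diagonal Gaussian evaluated at z."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((x - m) / s) ** 2
        for x, m, s in zip(z, mu, sigma)
    )


def sampled_kl(mu, sigma, num_samples, rng):
    """Monte Carlo estimate of the same KL using reparameterized samples."""
    zeros = [0.0] * len(mu)
    ones = [1.0] * len(mu)
    total = 0.0
    for _ in range(num_samples):
        # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)
        z = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        total += log_normal(z, mu, sigma) - log_normal(z, zeros, ones)
    return total / num_samples


# Hypothetical posterior parameters, chosen only for the demonstration
mu = [0.5, -1.0]
sigma = [0.8, 1.2]

exact = analytic_kl(mu, sigma)
est_a = sampled_kl(mu, sigma, 20000, random.Random(0))
est_b = sampled_kl(mu, sigma, 20000, random.Random(1))
print(exact, est_a, est_b)
```

The two sampled estimates should land close to the closed-form value but differ from each other, which matches the small call-to-call variation I saw when reading the KL off the trace.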