Mean-field variational inference for LDA

sami31 · May 31, 2022, 7:36pm

Hello
I am new to Pyro and I need to implement latent Dirichlet allocation (LDA) with mean-field variational inference. Ultimately, the goal is to use Pyro to implement LDA as Blei did in his paper (https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf).

In the Pyro tutorials I found how to do it using autoencoding variational inference for topic models (the "Probabilistic Topic Modeling" tutorial in the Pyro Tutorials 1.8.4 documentation). But this is not what I need, and I can’t adapt it to what I need.

Could someone help me please? For example by giving me a good tutorial or blog?

martinjankowiak · May 31, 2022, 7:40pm

that paper uses “old school” variational inference, i.e. a form of variational inference that requires closed form update equations and does not make use of stochastic ELBO estimates (apart from possible stochasticity from data-subsampling). the variational inference algorithms in pyro are really geared towards more general black-box stochastic variational inference and as such there’s no particularly compelling reason to implement such an inference algorithm in pyro. there should be lots of open source variational inference algorithms for LDA out there if that’s what you want

sami31 · May 31, 2022, 7:42pm

yeah that’s true. But I need a working implementation of LDA with mean-field variational inference for benchmarking purposes in my project. The problem is that I don’t know how to write it myself, since I am new to Pyro and I can’t find any tutorial on the internet that explains it.

martinjankowiak · May 31, 2022, 8:05pm

my point is that pyro’s variational inference functionality probably wouldn’t help you implement such an LDA algorithm. you might as well implement it in raw pytorch, jax, tensorflow, or numpy

sami31 · May 31, 2022, 8:14pm

Could you give me some of the reasons why Pyro functionality would not allow me to implement such a thing?

martinjankowiak · May 31, 2022, 8:59pm

pyro is focused on black-box updates. it cannot compute closed-form updates for you. therefore they would need to be coded by hand. if you’re coding them by hand anyway, what is the point of doing it in pyro?
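
For readers wondering what "coded by hand" looks like in practice, here is a minimal NumPy sketch of the per-document mean-field updates from the Blei et al. paper linked above (phi proportional to beta times exp(digamma(gamma)), then gamma = alpha + sum of phi). It is only an illustration under stated assumptions: the names `e_step` and `digamma`, the histogram input format, and the stopping rule are all hypothetical choices, `beta` is assumed strictly positive, and the M-step that re-estimates `beta` and `alpha` is left out. The small `digamma` helper is there only to keep the block NumPy-only; `scipy.special.digamma` would normally be used instead.

```python
import numpy as np

def digamma(x):
    """Digamma for x > 0 via recurrence plus an asymptotic series (illustrative helper)."""
    x = np.asarray(x, dtype=float).copy()
    out = np.zeros_like(x)
    for _ in range(6):  # psi(x) = psi(x + 1) - 1/x, applied six times
        out -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    out += np.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return out

def e_step(counts, beta, alpha, n_iter=100, tol=1e-6):
    """Closed-form variational updates for one document.

    counts: (V,) word counts; beta: (K, V) topic-word probabilities (> 0);
    alpha: (K,) Dirichlet prior. Returns (gamma, phi).
    """
    K = beta.shape[0]
    word_ids = np.nonzero(counts)[0]        # only words that occur in the document
    c = counts[word_ids]
    gamma = alpha + c.sum() / K             # initialisation as in the paper
    for _ in range(n_iter):
        # phi[v, k] is proportional to beta[k, v] * exp(digamma(gamma[k]))
        log_phi = np.log(beta[:, word_ids].T) + digamma(gamma)
        log_phi -= log_phi.max(axis=1, keepdims=True)   # numerical stability
        phi = np.exp(log_phi)
        phi /= phi.sum(axis=1, keepdims=True)
        new_gamma = alpha + c @ phi         # gamma[k] = alpha[k] + sum_v c[v] * phi[v, k]
        converged = np.abs(new_gamma - gamma).mean() < tol
        gamma = new_gamma
        if converged:
            break
    return gamma, phi

# Toy usage with made-up numbers
rng = np.random.default_rng(0)
K, V = 3, 6
beta = rng.dirichlet(np.ones(V), size=K)
alpha = np.full(K, 0.1)
counts = np.array([3, 0, 1, 2, 0, 4])
gamma, phi = e_step(counts, beta, alpha)
```

Nothing in this loop draws samples or needs automatic differentiation, which is why a plain array library is enough for it.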

sami31 · May 31, 2022, 9:04pm

As I said it’s for the sake of benchmarking.
Can you help me code this?

sami31 · June 1, 2022, 6:33pm

Is there anyone who can help?

eb8680_2 · June 1, 2022, 10:54pm

@sami31 the probabilistic topic modeling tutorial you linked to above contains an efficient and idiomatic approach to topic modeling in Pyro, with word topics collapsed manually and documents represented as histograms rather than sequences. If this tutorial cannot be modified to suit your needs, Pyro is probably not the right tool for the job.

Using enumeration to collapse word topics automatically as in the amortized LDA example script or your GitHub issue is elegant but will not scale favorably to large vocabularies or documents.
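
To make the "histograms rather than sequences" remark above concrete, here is a tiny sketch of turning a document given as a sequence of token ids into a count vector over the vocabulary. The function name `to_histogram` and the toy ids are hypothetical, and this is not taken from the tutorial's code.

```python
import numpy as np

def to_histogram(token_ids, vocab_size):
    """Convert a sequence of integer token ids into a bag-of-words count vector."""
    return np.bincount(token_ids, minlength=vocab_size)

doc_tokens = [2, 0, 2, 5]                 # a document as an ordered token sequence
doc_counts = to_histogram(doc_tokens, 6)  # the same document as a histogram
```

With this representation the per-word topic assignments never appear explicitly, and the cost per document depends on the vocabulary size rather than on the document length.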

sami31 · June 4, 2022, 2:48pm

@eb8680_2 Thank you for your comment.
I need to compute the perplexity of the LDA model that is trained in the "Probabilistic Topic Modeling" tutorial (Pyro Tutorials 1.8.4 documentation).
Is there a way to do it in Pyro or with any other tool? Should I export the trained model in a specific format?
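
One common way to get a perplexity number, sketched here under assumptions rather than as an answer from the thread: once the trained model's per-document word probabilities are exported as an array, perplexity is the exponential of the negative average log-likelihood per token, ideally on held-out documents. The function name `perplexity`, the array layout, and the `eps` guard are hypothetical. For a plain LDA point estimate `word_probs` could be `theta @ beta`; how to obtain it from the tutorial model is not covered here. Note that this is a plug-in estimate based on point estimates, not the variational bound on the marginal likelihood used in the Blei et al. paper, so the two numbers are not directly comparable.

```python
import numpy as np

def perplexity(counts, word_probs, eps=1e-12):
    """Plug-in perplexity: exp(-(sum of count * log prob) / (number of tokens)).

    counts: (D, V) word counts per document;
    word_probs: (D, V) predicted word distribution per document (rows sum to 1).
    """
    log_lik = np.sum(counts * np.log(word_probs + eps))  # eps guards against log(0)
    return float(np.exp(-log_lik / counts.sum()))

# Sanity check with made-up data: a uniform model over V words has perplexity V
counts = np.array([[2, 1, 0, 0, 3], [0, 0, 4, 1, 0]])
uniform = np.full(counts.shape, 1.0 / counts.shape[1])
ppl = perplexity(counts, uniform)
```

Lower values mean the model assigns higher probability to the observed words.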