Latent Dirichlet Allocation in Pyro/PyTorch

Hi folks,

I am looking for an implementation of LDA or SVI-LDA (Hoffman, M. D., Blei, D. M., Wang, C., & Paisley, J. W. (2013). Stochastic variational inference. Journal of Machine Learning Research, 14(1), 1303–1347) in Pyro/PyTorch. If you have an implementation, please share it.

Thanks!

Have a look at this example: Example: Amortized Latent Dirichlet Allocation — Pyro Tutorials 1.8.4 documentation

Thanks, Dave, for sharing. Do you consider this the state-of-the-art implementation of LDA compared to sklearn.decomposition.LatentDirichletAllocation? Can you comment on any similar methods for fitting LDA?

It depends on what you mean by “state of the art”. LDA by itself isn’t really state of the art anymore as far as topic models in general are concerned. If you like the fairly simple yet expressive nature of LDA, then sure, this is a good implementation and will scale to rather large datasets. As far as modifications go: what are you trying to achieve? I need more information about your problem space before I can give more detailed advice about model structure or inference methods.

Thanks, Dave, for the response. I am planning to implement supervised LDA (Blei and McAuliffe, 2010) in Pyro. I am new to Pyro, so I am excited to contribute. Any guidance in this endeavor is appreciated.

Great. Luckily for you, if that’s all you want to do, figure 1 of the paper gives you the model structure and section 3 covers inference. You don’t really need Pyro for this, since the paper derives analytical variational approximations. But if you want to use black-box SVI à la Pyro for some reason, it should not be too hard to a) use an autoguide over all continuous rvs and marginalize out the discrete ones, hiding them from the guide with poutine.block, or b) perform amortized inference by constructing some or all of the transformations in the guide using the NN architecture of your choice. Does this make sense?
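To make option (a) a bit more concrete, here is a rough sketch of what an sLDA-style model with that inference setup could look like. The variable names, shapes, and hyperparameters below are my own illustrative choices rather than the paper's notation, and as a simplification the response is regressed on theta instead of the empirical topic frequencies; treat this as a starting point, not a reference implementation.

```python
import torch
import pyro
import pyro.distributions as dist
from pyro import poutine
from pyro.infer import SVI, TraceEnum_ELBO
from pyro.infer.autoguide import AutoNormal
from pyro.optim import Adam

# Illustrative sizes (assumptions, not from the paper)
num_topics, vocab_size = 8, 1000
num_docs, words_per_doc = 64, 50


def model(docs, responses):
    # Per-topic word distributions
    with pyro.plate("topics", num_topics):
        beta = pyro.sample("beta", dist.Dirichlet(torch.ones(vocab_size) / vocab_size))
    # Regression coefficients and noise scale for the document-level response
    eta = pyro.sample("eta", dist.Normal(torch.zeros(num_topics), 1.0).to_event(1))
    sigma = pyro.sample("sigma", dist.LogNormal(0.0, 1.0))

    with pyro.plate("documents", num_docs, dim=-1):
        # Per-document topic proportions
        theta = pyro.sample("theta", dist.Dirichlet(torch.ones(num_topics)))
        with pyro.plate("words", words_per_doc, dim=-2):
            # Discrete topic assignments; marked for enumeration so
            # TraceEnum_ELBO marginalizes them out of the ELBO
            z = pyro.sample("z", dist.Categorical(theta),
                            infer={"enumerate": "parallel"})
            pyro.sample("w", dist.Categorical(beta[z]), obs=docs)
        # sLDA-style response; regressing on theta rather than the empirical
        # topic frequencies is a simplifying assumption for this sketch
        pyro.sample("y", dist.Normal(theta @ eta, sigma), obs=responses)


# Option (a): autoguide over the continuous sites only; the discrete z is
# hidden from the guide and handled by enumeration in the ELBO
guide = AutoNormal(poutine.block(model, hide=["z"]))
svi = SVI(model, guide, Adam({"lr": 0.01}),
          loss=TraceEnum_ELBO(max_plate_nesting=2))

# Toy data just to exercise the shapes: docs is (words_per_doc, num_docs)
docs = torch.randint(0, vocab_size, (words_per_doc, num_docs))
responses = torch.randn(num_docs)

pyro.clear_param_store()
for step in range(100):
    svi.step(docs, responses)
```

Option (b) would amount to replacing the autoguide with a custom guide in which an encoder network maps each document's word counts to the parameters of the variational distribution over theta, which is the pattern demonstrated in the amortized LDA tutorial linked above.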


Thanks, Dave, for providing the details. I am interested in understanding and possibly implementing black-box SVI/AEVB for supervised topic models. Could you point me to tutorials or sources for the two approaches you suggested: “a) use an autoguide over all continuous rvs and marginalize out the discrete ones, hiding them from the guide with poutine.block, or b) perform amortized inference by constructing some or all of the transformations in the guide using the NN architecture of your choice”? Please advise.

Hi @chandu, have a look at this tutorial illustrating amortized variational inference in a topic model in Pyro. (Sorry it was not visible; it seems to have been inadvertently dropped from the table of contents in a recent docs update.)

Thanks for sharing. It is very helpful.