Latent Dirichlet Allocation in Pyro/PyTorch

chandu · February 24, 2021, 12:21am

Hi folks,

I am looking for an implementation of LDA or SVI-LDA (Hoffman, M. D., Blei, D. M., Wang, C., & Paisley, J. W. (2013). Stochastic variational inference. Journal of Machine Learning Research, 14(1), 1303-1347) in Pyro/PyTorch. Please share if you have any implementation.

Thanks!

dave · February 24, 2021, 1:30am

Have a look at the examples, in particular "Example: Amortized Latent Dirichlet Allocation" in the Pyro Tutorials (1.8.4 documentation).

chandu · February 24, 2021, 2:31am

Thanks Dave for sharing. Do you consider this a state-of-the-art implementation of LDA compared to sklearn.decomposition.LatentDirichletAllocation? Can you comment on any similar methods for fitting LDA?

dave · February 24, 2021, 3:34am

It depends what you mean by “state of the art”. LDA by itself isn’t really state of the art anymore as far as topic models more generally are concerned. If you like the fairly simple yet expressive nature of LDA, then sure, this is a good implementation and will scale to rather large datasets. As far as modifications to this go: what are you trying to achieve? I need more info on your problem space before I can give more detailed advice about model structure or inference methods.

chandu · February 24, 2021, 4:43am

Thanks Dave for the response. I am planning to implement Supervised LDA (Blei and McAuliffe, 2010) in Pyro. I am new to Pyro so I am excited to contribute. Any guidance in this endeavor is appreciated.

dave · February 24, 2021, 12:08pm

Great. Luckily for you, if that’s all you want to do, then Figure 1 of the paper gives you the model structure and Section 3 covers inference. Actually, you really don’t need Pyro for this, since the paper derives analytical variational approximations. But if you want to use black-box SVI à la Pyro for some reason, it should not be too hard to a) use an autoguide over all continuous rvs and marginalize out all discrete ones in the guide using poutine.block or b) perform amortized inference by constructing some or all of the transformations in the guide using the NN architecture of your choice. Does this make sense?

chandu · February 25, 2021, 2:07am

Thanks Dave for providing the details. I am interested in understanding and possibly implementing black-box SVI/AEVB for supervised topic models. Could you point me to tutorials/sources for doing the following things you suggested: “a) use an autoguide over all continuous rvs and marginalize out all discrete ones in the guide using poutine.block or b) perform amortized inference by constructing some or all of the transformations in the guide using the NN architecture of your choice”. Please advise.

eb8680_2 · February 25, 2021, 10:57pm

Hi @chandu, have a look at this tutorial illustrating amortized variational inference in a topic model in Pyro. (Sorry it was not visible; it seems to have been inadvertently dropped from the table of contents in a recent docs update.)

chandu · February 25, 2021, 11:43pm

Thanks for sharing. It is very helpful.