Nov 13, 2014

Getting started with Latent Dirichlet Allocation in Python

In this post I will go over installation and basic usage of the lda Python package for Latent Dirichlet Allocation (LDA). I will not cover the theoretical foundations of the method here. However, the main reference for the model, Blei et al. (2003), is freely available online, and I think the main idea is fairly simple to understand: each document in a corpus (a set of documents) is assigned to latent (hidden) topics based on a vector of words. The example should help solidify our understanding of the LDA model.
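To make the "vector of words" idea concrete before getting to the package itself, here is a minimal sketch using only the standard library. The three-document corpus and the helper name `to_count_vector` are made up for illustration and are not part of the lda package or its data. Each document becomes a row of word counts over a shared vocabulary, which is roughly the kind of input a topic model works from.

```python
from collections import Counter

# Toy corpus, invented for illustration (not the data used later in the post).
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks fell as markets closed",
]

# Vocabulary: every distinct word, sorted so the column order is stable.
vocab = sorted({word for doc in docs for word in doc.split()})


def to_count_vector(doc, vocab):
    """Return how many times each vocabulary word occurs in doc (hypothetical helper)."""
    counts = Counter(doc.split())
    return [counts[word] for word in vocab]


# Document-term matrix: one row per document, one column per vocabulary word.
doc_term = [to_count_vector(doc, vocab) for doc in docs]

for doc, row in zip(docs, doc_term):
    print(row, "<-", doc)
```

The first two documents share words such as "cat" and "the", while the third shares none with them. Overlap of this kind is the signal a topic model can use to group documents.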