10 Topic Modeling
This chapter covers
- Introduction to topic modelling with Latent Dirichlet Allocation (LDA)
- Overview of gensim, an NLP toolkit for topic modelling
- Implementation of an unsupervised topic modelling approach using gensim
- Introduction of several visualization techniques for topic exploration in data
The previous chapter introduced various NLP and machine learning techniques for topic classification and topic analysis. Here is a reminder of the scenario that you’ve worked on: suppose you work as a content manager for a large news platform. Your platform hosts texts from a wide variety of authors and mainly specializes in the following set of well-established topics: “Politics”, “Finance”, “Science”, “Sports”, and “Arts”. Your task is to decide, for every incoming article, which topic it belongs to and post it under the relevant tab on the platform. Here are some questions for you to consider: