Our previous discussions of natural language processing (NLP) techniques focused on toy examples and small datasets. In this section, we apply NLP to large collections of real-world texts. Given the techniques presented thus far, this type of analysis seems straightforward. For example, suppose we’re doing market research across multiple online discussion forums. Each forum has hundreds of users who discuss a specific topic, such as politics, fashion, technology, or cars. We want to automatically extract all the discussion topics based on the contents of the user conversations. These extracted topics will be used to plan a marketing campaign, which will target users based on their online interests.