Chapter 8. Advanced NLP example: movie review sentiment
This chapter covers
- Using a real-world dataset for predicting sentiment from movie reviews
- Exploring possible use cases for this data and the appropriate modeling strategy
- Building an initial model using basic NLP features and optimizing the parameters
- Improving the accuracy of the model by extracting more-advanced NLP features
- Scaling and other deployment aspects of using this model in production
In this chapter, you’ll use some of the advanced feature-engineering knowledge acquired in the previous chapter to solve a real-world problem. Specifically, you’ll use advanced text and NLP feature-engineering processes to build and optimize a model based on user-submitted reviews of movies.
As always, you’ll start by investigating and analyzing the dataset at hand to understand the feature and target columns so you can make the best decisions about which feature-extraction and ML algorithms to use. You’ll then build the initial model from the simplest feature-extraction algorithms to see how you can quickly get a useful model with only a few lines of code. Next, you’ll dig a little deeper into the library of feature-extraction and ML modeling algorithms to improve the accuracy of the model even further. You’ll conclude by exploring various deployment and scalability aspects of putting the model into production.
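To make "a useful model with only a few lines of code" concrete, here is a minimal sketch of one simple approach: bag-of-words counts fed into a multinomial naive Bayes classifier, written in plain Python. It is an illustration under stated assumptions, not necessarily the method the chapter builds. The toy reviews, the whitespace tokenizer, and the names `TRAIN`, `train_naive_bayes`, and `predict` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy reviews (1 = positive, 0 = negative); NOT the chapter's dataset.
TRAIN = [
    ("a wonderful and moving film", 1),
    ("great acting and a great story", 1),
    ("boring plot and terrible acting", 0),
    ("a dull and terrible film", 0),
]


def tokenize(text):
    """Lowercase and split on whitespace: the simplest possible tokenizer."""
    return text.lower().split()


def train_naive_bayes(examples, alpha=1.0):
    """Count words per class (bag of words); alpha is the Laplace smoothing term."""
    word_counts = {}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, class_counts, vocab, alpha


def predict(model, text):
    """Return the class with the highest log-probability for the given text."""
    word_counts, class_counts, vocab, alpha = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)  # class prior
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                likelihood = (word_counts[label][word] + alpha) / (
                    total_words + alpha * len(vocab)
                )
                score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


model = train_naive_bayes(TRAIN)
print(predict(model, "a great and wonderful story"))
print(predict(model, "terrible and boring"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together, and the smoothing term keeps a word that is unseen in one class from forcing that class's probability to zero.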
8.5. Terms from this chapter
Word | Definition |
---|---|
word2vec | An NLP modeling framework, initially released by Google and used in many state-of-the-art machine-learning systems involving natural language |
hyperparameter optimization | Various techniques for choosing parameters that control ML algorithms’ execution to maximize their performance |
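
As a rough illustration of the hyperparameter optimization entry, the sketch below runs an exhaustive grid search: it tries every combination of candidate parameter values and keeps the one that scores best on held-out data. Grid search is only one of several possible techniques. The toy classifier, the validation reviews, and the names `classify`, `accuracy`, `grid_search`, `positive_weight`, and `threshold` are hypothetical and chosen only for this example.

```python
from itertools import product

# Hypothetical validation reviews (1 = positive, 0 = negative); illustration only.
VALIDATION = [
    ("great great film", 1),
    ("great but boring", 0),
    ("boring boring plot", 0),
    ("a great story", 1),
]


def classify(text, positive_weight, threshold):
    """Toy keyword scorer controlled by two hyperparameters (assumed for the demo)."""
    words = text.lower().split()
    score = positive_weight * words.count("great") - words.count("boring")
    return 1 if score > threshold else 0


def accuracy(params):
    """Fraction of validation reviews classified correctly with these parameters."""
    correct = sum(classify(text, **params) == label for text, label in VALIDATION)
    return correct / len(VALIDATION)


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination; keep the first one with the top score."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = grid_search(
    {"positive_weight": [2.0, 1.0, 0.5], "threshold": [0.5, 0.0, -0.5]},
    accuracy,
)
print(best_params, best_score)
```

The cost of a grid search grows multiplicatively with each added parameter, so it is practical only when the grid is small; scoring on data held out from training is what keeps the chosen parameters from simply memorizing the training set.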