8 Sentiment analysis with a data-driven approach

This chapter covers

Implementing improved algorithms for sentiment analysis
Introducing several machine-learning practices and techniques with scikit-learn
Applying a linguistic pipeline and linguistic concepts with spaCy
Combining the use of spaCy and NLTK resources

In the previous chapter, you started looking into sentiment analysis and implemented your first sentiment analyzer using a lexicon-based approach. Recall that sentiment analysis is concerned with automatically detecting the sentiment of a piece of text, usually along two dimensions: positive and negative. The task is commonly applied to opinionated texts such as reviews of movies, restaurants, products, and services. A good sentiment analyzer can save the user a lot of time!
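As a quick reminder of what a lexicon-based analyzer does, here is a minimal sketch in plain Python. It is not the previous chapter's implementation: the lexicon `TOY_LEXICON`, its weights, and the helper names `lexicon_score` and `predict_sentiment` are all hypothetical and chosen only for illustration. The idea is simply to sum the sentiment weights of the words in a text and label the text by the sign of the total.

```python
# Toy lexicon: hypothetical words and weights, for illustration only.
TOY_LEXICON = {
    "good": 1.0,
    "great": 2.0,
    "enjoyable": 1.0,
    "bad": -1.0,
    "boring": -1.5,
    "awful": -2.0,
}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the weights of all lexicon words found in the text."""
    tokens = text.lower().split()
    # Strip surrounding punctuation so that "great!" matches "great".
    words = [token.strip(".,!?;:\"'()") for token in tokens]
    return sum(lexicon.get(word, 0.0) for word in words)


def predict_sentiment(text, lexicon=TOY_LEXICON):
    """Label a text 'pos' if its total score is above zero, else 'neg'."""
    return "pos" if lexicon_score(text, lexicon) > 0 else "neg"


print(predict_sentiment("A great cast and an enjoyable plot!"))
print(predict_sentiment("Boring and awful."))
print(predict_sentiment("Not good."))
```

Note that this sketch labels "Not good." as positive, because it scores each word in isolation and ignores the negation. Word senses and context dependence of this kind are what the rest of this chapter sets out to address.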

8.1 Addressing multiple senses of a word with SentiWordNet

8.2 Addressing dependence on context with machine learning

8.2.1 Data preparation

8.2.2 Extracting features from text

8.2.3 Scikit-learn’s machine-learning pipeline

8.2.4 Full-scale evaluation with cross-validation

8.3 Varying the length of the sentiment-bearing features

8.4 Negation handling for sentiment analysis

8.5 Further practice

Summary