8 Sentiment analysis with a data-driven approach

 

This chapter covers

  • Implementing improved algorithms for sentiment analysis
  • Introducing several machine-learning practices and techniques with scikit-learn
  • Applying spaCy's linguistic pipeline and linguistic concepts
  • Combining spaCy and NLTK resources

In the previous chapter, you started looking into sentiment analysis and implemented your first sentiment analyzer using a lexicon-based approach. Recall that sentiment analysis is concerned with the automated detection of the sentiment expressed in a piece of text (usually classified as either positive or negative). It is often applied to opinionated texts such as reviews of movies, restaurants, products, and services. A good sentiment analyzer can save the user a lot of time!

8.1 Addressing multiple senses of a word with SentiWordNet

8.2 Addressing dependence on context with machine learning

8.2.1 Data preparation

8.2.2 Extracting features from text

8.2.3 Scikit-learn’s machine-learning pipeline

8.2.4 Full-scale evaluation with cross-validation

8.3 Varying the length of the sentiment-bearing features

8.4 Negation handling for sentiment analysis

8.5 Further practice

Summary