Chapter 1. Your first NLP example

Chapter 2 from Getting Started with Natural Language Processing by Ekaterina Kochmar

This chapter covers:

  • How to implement your first practical NLP application from scratch
  • How to structure an NLP project from beginning to end
  • A number of useful NLP concepts, including tokenization and text normalization
  • How to apply a machine learning algorithm to textual data

In this chapter, you will implement your own NLP application from scratch. Along the way, you will learn how to structure a typical NLP pipeline and how to apply a simple machine learning algorithm to solve your task. The application you will build is a spam filter. Chapter 1 introduced spam filtering as one of the classic tasks at the intersection of NLP and machine learning.
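To make the shape of such a pipeline concrete before going into detail, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the implementation developed in this chapter: the four-message toy dataset, the function names (`tokenize`, `train`, `classify`), and the choice of a Naive Bayes classifier with add-one smoothing are all assumptions made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical toy dataset, invented for illustration only.
TOY_DATA = [
    ("Win a free prize now", "spam"),
    ("Claim your free money", "spam"),
    ("Meeting agenda for Monday", "ham"),
    ("Lunch with the project team", "ham"),
]

def tokenize(text):
    """Normalize text to lowercase and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def train(labeled_texts):
    """Collect per-label word counts and document counts."""
    word_counts = {}        # label -> Counter of tokens seen with that label
    doc_counts = Counter()  # label -> number of training messages
    for text, label in labeled_texts:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in doc_counts:
        # Log prior: the share of training messages carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # skip words never seen in training
            # Add-one smoothing keeps unseen label/word pairs above zero.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TOY_DATA)
print(classify("Free prize money", model))
print(classify("Project meeting on Monday", model))
```

The sketch has the same three stages this chapter discusses: turning raw text into normalized tokens, learning from labeled examples, and applying what was learned to new messages. A real spam filter needs far more data and a proper evaluation step.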

2.1 Introducing NLP in practice: spam filtering

This book uses spam filtering as your first practical NLP application because it is an example of a very widespread family of tasks: text classification. Text classification covers a number of applications discussed in this book, for example user profiling (Chapter 5), sentiment analysis (Chapter 6), and topic labeling (Chapter 8), so this chapter will give you a good start for the rest of the book. First, let’s see what exactly classification addresses.

2.2 Understanding the task

2.3 Implementing your own spam filter

2.4 Deploying your spam filter in practice

2.5 Summary