Chapter 13. Introduction to classification

This chapter covers

  • Why Mahout is a powerful choice for classification
  • Key classification concepts and terminology
  • The workflow of a typical classification project
  • A step-by-step classification example

Life often presents us with questions that aren’t open-ended but instead ask us to choose from a limited number of options. This relatively simple idea forms the basis for classification, whether it is done by humans or by machines. Classification relies on sorting the potential answers into categories, and machine-based classification is the automation of such simplified decisions.

This chapter identifies the cases where Mahout is a good approach for classification and explains why it offers an advantage over other approaches. As an introduction to classification, it also explains what classification is and provides a foundation for understanding the basic terminology and concepts. In addition, we present a practical overview of how classification works, covering the three stages in the workflow of a typical classification project: training the model, evaluating and tuning the model, and using the model in production. Following this introduction, we look at a simple classification project and show you, step by step, how to put these basic ideas into practice.

13.1. Why use Mahout for classification?

13.2. The fundamentals of classification systems

13.3. How classification works

13.4. Workflow in a typical classification project

13.5. Step-by-step simple classification example

13.6. Summary