Part 3. Classification
This third and final part of Mahout in Action, comprising chapters 13 through 17, covers how to use Mahout for classification. Using the techniques presented here, you’ll be able to structure questions, and choose and prepare data appropriately, so that machines can automatically assign data to preselected categories. Classification is a simplified form of decision making that gives discrete answers to an individual question. Machine-based classification automates this decision-making process: it learns from examples of correct decisions and emulates those decisions automatically, a core concept in predictive analytics. Classification’s reliance on guided learning and its focus on answering one question at a time distinguish it from clustering and recommendation, discussed in the previous two parts of this book. Clustering, in contrast, relies on machines to find groupings on their own, without examples of correct answers; recommenders select and rank the best of many possible answers.