Chapter 15. Evaluating and tuning a classifier
This chapter covers
- Basic considerations in evaluating classifiers
- Using the Mahout evaluation API
- Measuring the performance of an SGD classifier
- Common classifier problems and how to diagnose them
- Approaches for tuning a classifier
Because evaluation is so important, it’s built into Mahout at a fundamental level. This chapter deals mainly with stage 2 of the classification process: evaluating and fine-tuning a classifier to prepare it for deployment and to maintain its performance in production. We discuss at a high level how classifiers are evaluated, cover the details of the Mahout evaluation API, and work through an example of using it. We also present several examples that show how the performance metrics and diagnostic capabilities of the Mahout evaluation API can be used to diagnose common problems with classifiers. We finish with a discussion of classifier-tuning strategies and techniques that range from choosing an algorithm to adjusting learning rates.
Classifier evaluation presents several pitfalls, and this chapter offers ways to avoid the more costly ones. Evaluation is also challenging because the internals of a classification model can be difficult to understand.