Chapter 6. Combining classifiers


This chapter covers:

  • Evaluating baselines for classifiers
  • Comparing classifiers and understanding complex datasets
  • The nuts and bolts of bootstrap aggregating
  • Basics of boosting

Epictetus, an ancient Greek philosopher, proclaimed, “One must neither tie a ship to a single anchor, nor life to a single hope.” Similarly, we don’t have to rely on a single classifier: no single classifier can provide infallible decision-making capability. In fact, there are plenty of examples that demonstrate the great potential of combining classifiers, and this chapter provides an introduction to that fascinating subject. In the context of recommendation systems (see chapter 3), Bell, Koren, and Volinsky employed similar ideas with great success in the Netflix Prize competition.

The main idea behind combining classifiers is to achieve better classification results at the expense of higher computational cost (for example, longer training and classification times or additional computational resources). Methods for combining classifiers fall into two general categories: classifier fusion and classifier selection. In classifier fusion, all classifiers contribute to every classification decision, so each classifier must cover the entire domain of possible data points. In classifier selection, each classifier is responsible for a particular region of the data space and is expected to perform well only within its region of influence.
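To make the distinction concrete, consider the following minimal Python sketch contrasting the two categories. This isn’t code from this chapter; ThresholdClassifier is a hypothetical stand-in for any trained classifier, and the region predicates in the selection example are illustrative routing rules.

from collections import Counter

class ThresholdClassifier:
    """A toy stand-in for any trained binary classifier."""
    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, x):
        return "good" if x >= self.threshold else "bad"

def fuse_by_majority(classifiers, x):
    """Classifier fusion: every classifier votes on every point,
    so each one must cover the entire input domain."""
    votes = [clf.predict(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

def select_by_region(routed_classifiers, x):
    """Classifier selection: each classifier answers only within
    the region of the input space it is responsible for."""
    for in_region, clf in routed_classifiers:
        if in_region(x):
            return clf.predict(x)
    raise ValueError("no classifier covers this point")

# Fusion: all three classifiers vote on x = 0.6; the majority wins.
ensemble = [ThresholdClassifier(t) for t in (0.3, 0.5, 0.7)]
print(fuse_by_majority(ensemble, 0.6))

# Selection: only the classifier that owns x's region is consulted.
routed = [(lambda x: x < 0.5, ThresholdClassifier(0.2)),
          (lambda x: x >= 0.5, ThresholdClassifier(0.8))]
print(select_by_region(routed, 0.6))

Note the tradeoff this sketch exposes: under fusion, every member must be able to label any point, whereas under selection the routing predicates must jointly cover the whole input space.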

6.1. Credit worthiness: a case study for combining classifiers

6.2. Credit evaluation with a single classifier

6.3. Comparing multiple classifiers on the same data

6.4. Bagging: bootstrap aggregating

6.5. Boosting: an iterative improvement approach

6.6. Summary

6.7. To Do

6.8. References