Chapter nine

9 From single trees to forests: Leo Breiman and the logic of ensemble learning


This chapter covers

  • Leo Breiman’s Random Forests (2001) and the emergence of ensemble learning from unstable single trees
  • Decision trees as expressive but high-variance learners and the roots of their generalization failure
  • How bootstrap aggregation, combined with voting or averaging, stabilizes noisy predictors
  • The strength–correlation framework as a theory of ensemble generalization
  • Random forests as a trade-off between local interpretability and global predictive reliability
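The bagging-and-voting idea previewed above can be sketched in a few lines of plain Python. The example below is illustrative only, not Breiman's original procedure: it bags one-split "stumps" on a toy one-dimensional dataset rather than full trees on real data, and the names (`fit_stump`, `bag_stumps`, `n_estimators`) are invented for this sketch. It shows the two mechanical ingredients of bagging: bootstrap resampling with replacement, and aggregation of the resulting predictors by majority vote.

```python
import random

# Toy 1-D training set: class 0 below roughly x = 5, class 1 above,
# with two noisy points (4.5 and 5.5) that make any single stump unstable.
X = [1.0, 2.0, 3.0, 4.0, 4.5, 5.5, 6.0, 7.0, 8.0, 9.0]
y = [0,   0,   0,   0,   1,   0,   1,   1,   1,   1]

def fit_stump(xs, ys):
    """Fit a one-split 'stump': pick the threshold t (among the sampled
    x-values) minimizing training error for the rule 'predict 1 iff x >= t'."""
    best_t, best_err = None, None
    for t in xs:
        err = sum((1 if x >= t else 0) != label for x, label in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def bag_stumps(X, y, n_estimators=25, seed=0):
    """Bootstrap aggregation: fit one stump per bootstrap resample."""
    rng = random.Random(seed)
    n = len(X)
    thresholds = []
    for _ in range(n_estimators):
        # Draw n points WITH replacement -- the bootstrap step.
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return thresholds

def predict(thresholds, x):
    """Aggregate by majority vote across the ensemble (ties go to class 1)."""
    votes_for_1 = sum(1 for t in thresholds if x >= t)
    return 1 if 2 * votes_for_1 >= len(thresholds) else 0

ensemble = bag_stumps(X, y)
print(predict(ensemble, 1.5), predict(ensemble, 9.0))
```

Each individual stump may place its threshold erratically, pulled around by whichever noisy points its bootstrap sample happens to contain; the vote of twenty-five such stumps is far steadier. A random forest adds one more ingredient on top of this recipe, restricting each split to a random subset of features, which is the subject of section 9.4.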

The previous chapter examined how Vladimir Vapnik confronted one of the earliest and most persistent failures of machine learning: models that achieved impressive accuracy on training data yet performed unreliably on new examples. Vapnik’s response was to redefine learning itself. Rather than chasing accuracy alone, support vector machines—particularly soft-margin SVMs—sought generalization by explicitly controlling model capacity, balancing margin width against classification error, and grounding learning in statistical theory. Geometry became a disciplined safeguard against overfitting rather than a mere visualization of decision boundaries.

9.1 Why single decision trees fail to generalize

9.1.1 The appeal—and the trap—of recursive splitting

9.1.2 Instability, variance, and overfitting

9.1.3 A critical insight

9.2 How a decision tree actually learns

9.2.1 Recursive partitioning

9.2.2 Impurity and information

9.2.3 Overfitting as depth increases

9.2.4 Interpreting a decision tree: from root to leaf

9.3 Bagging: the first step toward stability

9.3.1 Bootstrap aggregation

9.3.2 What bagging fixes—and what it doesn’t

9.4 Random forests: injecting randomness where it matters

9.4.1 Breiman’s defining move

9.4.2 Why feature randomness matters

9.4.3 A conceptual definition

9.5 Strength, correlation, and generalization: Breiman’s theory

9.5.1 Margin in random forests

9.5.2 The strength–correlation trade-off

9.5.3 Why random forests don’t overfit as more trees are added

9.5.4 Closing perspective

9.6 Out-of-bag error: internal validation without a test set

9.6.1 What “out-of-bag” really means

9.6.2 Why this mattered historically

9.6.3 Conceptual importance

9.7 Variable importance and the forest as a glass box

9.7.1 From black box to diagnostic tool

9.7.2 Limits and cautions

9.7.3 Interpreting importance responsibly