Chapter nine

9 From single trees to forests: Leo Breiman and the logic of ensemble learning


This chapter covers

  • Leo Breiman’s Random Forests (2001) and the emergence of ensemble learning from unstable single trees
  • Decision trees as expressive but high-variance learners and the roots of their generalization failure
  • How bootstrap aggregation, combined with voting or averaging, stabilizes noisy predictors
  • The strength–correlation framework as a theory of ensemble generalization
  • Random forests as a trade-off between local interpretability and global predictive reliability
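The bagging-and-voting idea previewed above can be sketched in a few lines of plain Python. The example below is illustrative only, not Breiman's original procedure: it bags one-split "stumps" on a toy one-dimensional dataset rather than full trees on real data, and the names (`fit_stump`, `bag_stumps`, `n_estimators`) are invented for this sketch. It shows the two mechanical ingredients of bagging: bootstrap resampling with replacement, and aggregation of the resulting predictors by majority vote.

```python
import random

# Toy 1-D training set: class 0 below roughly x = 5, class 1 above,
# with two noisy points (4.5 and 5.5) that make any single stump unstable.
X = [1.0, 2.0, 3.0, 4.0, 4.5, 5.5, 6.0, 7.0, 8.0, 9.0]
y = [0,   0,   0,   0,   1,   0,   1,   1,   1,   1]

def fit_stump(xs, ys):
    """Fit a one-split 'stump': pick the threshold t (among the sampled
    x-values) minimizing training error for the rule 'predict 1 iff x >= t'."""
    best_t, best_err = None, None
    for t in xs:
        err = sum((1 if x >= t else 0) != label for x, label in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def bag_stumps(X, y, n_estimators=25, seed=0):
    """Bootstrap aggregation: fit one stump per bootstrap resample."""
    rng = random.Random(seed)
    n = len(X)
    thresholds = []
    for _ in range(n_estimators):
        # Draw n points WITH replacement -- the bootstrap step.
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return thresholds

def predict(thresholds, x):
    """Aggregate by majority vote across the ensemble (ties go to class 1)."""
    votes_for_1 = sum(1 for t in thresholds if x >= t)
    return 1 if 2 * votes_for_1 >= len(thresholds) else 0

ensemble = bag_stumps(X, y)
print(predict(ensemble, 1.5), predict(ensemble, 9.0))
```

Each individual stump may place its threshold erratically, pulled around by whichever noisy points its bootstrap sample happens to contain; the vote of twenty-five such stumps is far steadier. A random forest adds one more ingredient on top of this recipe, restricting each split to a random subset of features, which is the subject of section 9.4.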

The previous chapter examined how Vladimir Vapnik confronted one of the earliest and most persistent failures of machine learning: models that achieved impressive accuracy on training data yet performed unreliably on new examples. Vapnik’s response was to redefine learning itself. Rather than chasing accuracy alone, support vector machines—particularly soft-margin SVMs—sought generalization by explicitly controlling model capacity, balancing margin width against classification error, and grounding learning in statistical theory. Geometry became a disciplined safeguard against overfitting rather than a mere visualization of decision boundaries.

9.1 Why single decision trees fail to generalize

9.1.1 The appeal—and the trap—of recursive splitting

9.1.2 Instability, variance, and overfitting

9.1.3 A critical insight

9.2 How a decision tree actually learns

9.2.1 Recursive partitioning

9.2.2 Impurity and information

9.2.3 Overfitting as depth increases

9.2.4 Interpreting a decision tree: from root to leaf

9.3 Bagging: the first step toward stability

9.3.1 Bootstrap aggregation

9.3.2 What bagging fixes—and what it doesn’t

9.4 Random forests: injecting randomness where it matters

9.4.1 Breiman’s defining move

9.4.2 Why feature randomness matters

9.4.3 A conceptual definition

9.5 Strength, correlation, and generalization: Breiman’s theory

9.5.1 Margin in random forests

9.5.2 The strength–correlation trade-off

9.5.3 Why random forests don’t overfit as more trees are added

9.5.4 Closing perspective

9.6 Out-of-bag error: internal validation without a test set

9.6.1 What “out-of-bag” really means

9.6.2 Why this mattered historically

9.6.3 Conceptual importance

9.7 Variable importance and the forest as a glass box

9.7.1 From black box to diagnostic tool

9.7.2 Limits and cautions

9.7.3 Interpreting importance responsibly