13 From foundations to frontier: OpenAI and the scaling laws of modern intelligence
This chapter covers
- Jared Kaplan et al.’s Scaling Laws for Neural Language Models (2020) and the discovery that model performance follows power-law relationships
- Why performance in modern AI depends primarily on three interacting resources: model parameters, training data, and compute
- How scaling allows modern learning systems to improve predictably as the volume of training resources grows
- How scaling behavior connects modern AI to earlier statistical ideas about likelihood, information, and generalization
- Why recent advances in AI reflect the large-scale expansion of existing learning systems more than the invention of new algorithms
13.1 The discovery of scaling laws
13.1.1 The empirical puzzle
13.1.2 Measuring how performance scales
13.1.3 The three scaling variables
13.2 The power laws of modern AI
13.2.1 Power-law behavior in model performance
13.2.2 From algorithms to architecture to scale
13.2.3 The compute-efficient frontier
13.3 What scaling revealed about learning systems
13.3.1 Large models and sample efficiency
13.3.2 Scaling and generalization
13.3.3 Transfer and emergent capability
13.4 Scaling as the synthesis of earlier ideas
13.4.1 Fisher and likelihood-based learning
13.4.2 Shannon and information
13.4.3 Vapnik and generalization
13.4.4 Breiman and ensemble intelligence
13.4.5 The convergence of foundational ideas
13.5 The limits of scaling
13.5.1 Physical and computational limits
13.5.2 Data constraints
13.5.3 Economic and technological limits
13.6 Closing perspectives: from foundations to frontier
13.7 Summary