5 Fundamentals of machine learning

This chapter covers

Understanding the tension between generalization and optimization, the fundamental issue in machine learning
Evaluation methods for machine learning models
Best practices to improve model fitting
Best practices to achieve better generalization

After the three practical examples in chapter 4, you should be starting to feel familiar with how to approach classification and regression problems using neural networks, and you’ve witnessed the central problem of machine learning: overfitting. This chapter will formalize some of your new intuition about machine learning into a solid conceptual framework, highlighting the importance of accurate model evaluation and the balance between training and generalization.

5.1 Generalization: The goal of machine learning

In the three examples presented in chapter 4 (movie-review classification, topic classification, and house-price regression), we split the data into a training set, a validation set, and a test set. The reason not to evaluate the models on the same data they were trained on quickly became evident: after just a few epochs, performance on never-before-seen data started diverging from performance on the training data, which keeps improving as training progresses. The models started to overfit. Overfitting happens in every machine learning problem.
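
The divergence described above can be made concrete with a small sketch. The code below is plain Python and assumes you have already recorded one training loss and one validation loss per epoch. The helper names `best_epoch` and `generalization_gap` are hypothetical, and the loss values are made up purely for illustration; they are not results from the chapter 4 models.

```python
def best_epoch(val_loss):
    """Return the 1-based epoch at which the validation loss is lowest."""
    return min(range(len(val_loss)), key=val_loss.__getitem__) + 1


def generalization_gap(train_loss, val_loss):
    """Return the per-epoch difference: validation loss minus training loss."""
    return [v - t for t, v in zip(train_loss, val_loss)]


# Illustrative, made-up loss curves (NOT measured results).
# Training loss keeps falling, while validation loss bottoms out and then rises.
train_loss = [0.60, 0.42, 0.31, 0.24, 0.18, 0.13]
val_loss = [0.62, 0.48, 0.41, 0.40, 0.44, 0.51]

turning_point = best_epoch(val_loss)
gaps = generalization_gap(train_loss, val_loss)

print(f"Validation loss is lowest at epoch {turning_point}")
for epoch, gap in enumerate(gaps, start=1):
    # Epochs after the turning point are where overfitting shows up.
    marker = "  <- past the validation minimum" if epoch > turning_point else ""
    print(f"epoch {epoch}: gap = {gap:.2f}{marker}")
```

In this toy setup, the epochs after the validation minimum are the ones where the model continues to fit the training data better while doing worse on held-out data, which is the pattern the text calls overfitting.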

5.1.1 Underfitting and overfitting

5.1.2 The nature of generalization in deep learning

5.2 Evaluating machine learning models

5.2.1 Training, validation, and test sets

5.2.2 Beating a common-sense baseline

5.2.3 Things to keep in mind about model evaluation

5.3 Improving model fit

5.3.1 Tuning key gradient descent parameters

5.3.2 Leveraging better architecture priors

5.3.3 Increasing model capacity

5.4 Improving generalization

5.4.1 Dataset curation

Summary