1 Introducing Regularization
This chapter covers
- The what and why of regularization in deep learning
- Underfitting versus overfitting
- Bias and variance trade-off
- A typical model training process
- Different types of regularization
We all want to do well in exams. From a young age, we have been taught to study hard to achieve high marks, not only in practice exams but also in the final exam. In the world of modeling, the practice exam is analogous to the training data given to us, where both questions and answers are available. We want to train a model based on the training data and then use the trained model to make predictions on the test data, which is the final exam.
Our goal is to train an excellent mental model that works well on both the practice exam and the final exam. We know that the common problem-solving patterns will largely remain the same between the two exams, so it is essential to understand and learn these patterns during the practice exam in order to do well in the final exam. Depending on our preparation strategy, our trained mental model could perform well or poorly on either exam, ending up with one of four possible outcomes: good in both exams, bad in both exams, good in the practice exam but unfortunately bad in the final exam, and bad in the practice exam but surprisingly good in the final exam. These four exam outcomes are illustrated in figure 1.1.
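The exam analogy can be made concrete with a small numeric sketch. The following is a hypothetical illustration (not an example from this chapter): we fit polynomial "mental models" of increasing capacity to a noisy training set (the practice exam) and score them on a held-out test set (the final exam). A low-capacity model does badly on both, a well-chosen one does well on both, and a high-capacity model does very well on the training set but worse on the test set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D regression task: the "practice exam" (training set) and
# the "final exam" (test set) are drawn from the same underlying pattern.
def make_exam(n):
    x = rng.uniform(-3, 3, n)
    y = np.sin(x) + rng.normal(0, 0.3, n)  # true pattern plus noise
    return x, y

x_train, y_train = make_exam(20)
x_test, y_test = make_exam(200)

def exam_scores(degree):
    """Fit a polynomial of the given degree (model capacity) on the
    training set and return (train MSE, test MSE)."""
    coefs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    return train_err, test_err

for degree in (1, 5, 15):
    train_err, test_err = exam_scores(degree)
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```

Raising the polynomial degree always drives the training error down, but past some point the test error climbs back up: the model has memorized the practice exam (including its noise) instead of learning the shared pattern — the "good in practice, bad in final" outcome that regularization is designed to prevent.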