4 Training fundamentals


This chapter covers

  • Forward feeding and backward propagation
  • Splitting datasets and preprocessing data
  • Using validation data to monitor overfitting
  • Using checkpointing and early stopping for more-economical training
  • Using hyperparameters versus model parameters
  • Training for invariance to location and scale
  • Assembling and accessing on-disk datasets
  • Saving and then restoring a trained model

This chapter covers the fundamentals of training a model. Prior to 2019, the majority of models were trained with this fundamental set of steps, so consider this chapter a foundation.

In this chapter, we cover methods, techniques, and best practices developed over time through experimentation and trial and error. We start by reviewing forward feeding and backward propagation. While these concepts and practices predate deep learning, numerous refinements over the years have made model training practical: specifically, in how we split the data, feed it to the model, and then update the weights using gradient descent during backward propagation. These refinements provided the means to train models to convergence, the point at which the model's prediction accuracy plateaus.
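To make the forward-feed/backward-propagation cycle concrete before the detailed sections, here is a minimal NumPy sketch of training a single linear neuron with gradient descent. The data, learning rate, and variable names are illustrative choices, not taken from the chapter:

```python
import numpy as np

# Illustrative data: learn y = 2x from four samples
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = 2.0 * X

rng = np.random.default_rng(0)
w, b, lr = rng.normal(), 0.0, 0.01    # random weight, zero bias, learning rate

for epoch in range(1000):
    # Forward feed: compute predictions and the mean-squared-error loss
    y_hat = X * w + b
    loss = np.mean((y_hat - y) ** 2)

    # Backward propagation: gradients of the loss w.r.t. w and b
    grad = 2.0 * (y_hat - y) / len(X)
    dw = float(np.sum(grad * X))
    db = float(np.sum(grad))

    # Gradient descent: step the parameters against the gradient
    w -= lr * dw
    b -= lr * db
```

After enough epochs the loss plateaus and `w` settles near 2.0, which is the convergence behavior described above.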

4.1 Forward feeding and backward propagation

4.1.1 Feeding

4.1.2 Backward propagation

4.2 Dataset splitting

4.2.1 Training and test sets
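As a preview of this section, the split can be sketched in a few lines of NumPy: shuffle the sample indices, then carve off a holdout portion. The 80/20 ratio and seed here are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.arange(100).reshape(100, 1)   # 100 samples, 1 feature (illustrative)
y = np.arange(100)                   # matching labels

indices = rng.permutation(len(X))    # shuffle before splitting
split = int(0.8 * len(X))            # 80% of samples go to training
train_idx, test_idx = indices[:split], indices[split:]

X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
```

Shuffling first matters: if the samples are ordered (for example, by class), an unshuffled split would give the model a training set that does not represent the test set.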

4.2.2 One-hot encoding
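A quick sketch of the idea in NumPy: each integer label becomes a vector that is all zeros except for a one at the label's index. The labels below are illustrative:

```python
import numpy as np

labels = np.array([0, 2, 1, 2])          # integer class labels, 3 classes
num_classes = labels.max() + 1

# Indexing an identity matrix by the labels yields the one-hot rows
one_hot = np.eye(num_classes)[labels]
# e.g. label 2 becomes the vector [0., 0., 1.]
```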

4.3 Data normalization

4.3.1 Normalization

4.3.2 Standardization
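Since these two headings introduce the chapter's two rescaling techniques, a minimal sketch contrasting them may help; the pixel-like values are illustrative:

```python
import numpy as np

x = np.array([0.0, 64.0, 128.0, 255.0])   # e.g. 8-bit pixel intensities

# Normalization: squash values into the range [0, 1] using min and max
x_norm = (x - x.min()) / (x.max() - x.min())

# Standardization: shift and scale to zero mean and unit standard deviation
x_std = (x - x.mean()) / x.std()
```

Normalization bounds the inputs to a fixed range, while standardization centers them; both keep the input magnitudes small and consistent, which helps gradient descent converge.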

4.4 Validation and overfitting

4.4.1 Validation

4.4.2 Loss monitoring
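Loss monitoring ties back to the checkpointing and early stopping mentioned in the chapter opener: watch the validation loss each epoch, remember the best epoch (where a checkpoint would be saved), and stop once the loss has not improved for a set number of epochs. The validation-loss values below are synthetic and purely illustrative:

```python
# Synthetic validation-loss curve: improves, then overfitting sets in
val_losses = [0.9, 0.7, 0.55, 0.5, 0.48, 0.49, 0.52, 0.56, 0.61]
patience = 3                              # epochs to wait without improvement

best_loss, best_epoch = float("inf"), 0
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if loss < best_loss:
        best_loss, best_epoch = loss, epoch   # checkpoint the model here
    elif epoch - best_epoch >= patience:
        stopped_at = epoch                    # early stopping triggered
        break
```

Here the best validation loss occurs at epoch 4; after three further epochs with no improvement, training halts at epoch 7, and the epoch-4 checkpoint would be restored as the final model.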