4 Training fundamentals


This chapter covers

  • Forward feeding and backward propagation
  • Splitting datasets and preprocessing data
  • Using validation data to monitor overfitting
  • Using checkpointing and early stopping for more-economical training
  • Using hyperparameters versus model parameters
  • Training for invariance to location and scale
  • Assembling and accessing on-disk datasets
  • Saving and then restoring a trained model

This chapter covers the fundamentals of training a model. Prior to 2019, the majority of models were trained with this fundamental set of steps, so consider this chapter a foundation.

In this chapter, we cover methods, techniques, and best practices developed over time through experimentation and trial and error. We start by reviewing forward feeding and backward propagation. While these concepts and practices predate deep learning, numerous refinements over the years have made model training practical: specifically, in how we split the data, feed it to the model, and then update the weights using gradient descent during backward propagation. These refinements provided the means to train models to convergence, the point at which the model's prediction accuracy plateaus.
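To make the forward-feed/backward-propagation cycle concrete before the detailed sections, here is a minimal NumPy sketch of training a single linear neuron with gradient descent. The data, learning rate, and variable names are illustrative choices, not taken from the chapter:

```python
import numpy as np

# Illustrative data: learn y = 2x from four samples
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = 2.0 * X

rng = np.random.default_rng(0)
w, b, lr = rng.normal(), 0.0, 0.01    # random weight, zero bias, learning rate

for epoch in range(1000):
    # Forward feed: compute predictions and the mean-squared-error loss
    y_hat = X * w + b
    loss = np.mean((y_hat - y) ** 2)

    # Backward propagation: gradients of the loss w.r.t. w and b
    grad = 2.0 * (y_hat - y) / len(X)
    dw = float(np.sum(grad * X))
    db = float(np.sum(grad))

    # Gradient descent: step the parameters against the gradient
    w -= lr * dw
    b -= lr * db
```

After enough epochs the loss plateaus and `w` settles near 2.0, which is the convergence behavior described above.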

4.1 Forward feeding and backward propagation

4.1.1 Feeding

4.1.2 Backward propagation

4.2 Dataset splitting

4.2.1 Training and test sets
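As a preview of this section, the split can be sketched in a few lines of NumPy: shuffle the sample indices, then carve off a holdout portion. The 80/20 ratio and seed here are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.arange(100).reshape(100, 1)   # 100 samples, 1 feature (illustrative)
y = np.arange(100)                   # matching labels

indices = rng.permutation(len(X))    # shuffle before splitting
split = int(0.8 * len(X))            # 80% of samples go to training
train_idx, test_idx = indices[:split], indices[split:]

X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
```

Shuffling first matters: if the samples are ordered (for example, by class), an unshuffled split would give the model a training set that does not represent the test set.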

4.2.2 One-hot encoding
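A quick sketch of the idea in NumPy: each integer label becomes a vector that is all zeros except for a one at the label's index. The labels below are illustrative:

```python
import numpy as np

labels = np.array([0, 2, 1, 2])          # integer class labels, 3 classes
num_classes = labels.max() + 1

# Indexing an identity matrix by the labels yields the one-hot rows
one_hot = np.eye(num_classes)[labels]
# e.g. label 2 becomes the vector [0., 0., 1.]
```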

4.3 Data normalization

4.3.1 Normalization

4.3.2 Standardization
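Since these two headings introduce the chapter's two rescaling techniques, a minimal sketch contrasting them may help; the pixel-like values are illustrative:

```python
import numpy as np

x = np.array([0.0, 64.0, 128.0, 255.0])   # e.g. 8-bit pixel intensities

# Normalization: squash values into the range [0, 1] using min and max
x_norm = (x - x.min()) / (x.max() - x.min())

# Standardization: shift and scale to zero mean and unit standard deviation
x_std = (x - x.mean()) / x.std()
```

Normalization bounds the inputs to a fixed range, while standardization centers them; both keep the input magnitudes small and consistent, which helps gradient descent converge.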

4.4 Validation and overfitting

4.4.1 Validation

4.4.2 Loss monitoring
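Loss monitoring ties back to the checkpointing and early stopping mentioned in the chapter opener: watch the validation loss each epoch, remember the best epoch (where a checkpoint would be saved), and stop once the loss has not improved for a set number of epochs. The validation-loss values below are synthetic and purely illustrative:

```python
# Synthetic validation-loss curve: improves, then overfitting sets in
val_losses = [0.9, 0.7, 0.55, 0.5, 0.48, 0.49, 0.52, 0.56, 0.61]
patience = 3                              # epochs to wait without improvement

best_loss, best_epoch = float("inf"), 0
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if loss < best_loss:
        best_loss, best_epoch = loss, epoch   # checkpoint the model here
    elif epoch - best_epoch >= patience:
        stopped_at = epoch                    # early stopping triggered
        break
```

Here the best validation loss occurs at epoch 4; after three further epochs with no improvement, training halts at epoch 7, and the epoch-4 checkpoint would be restored as the final model.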