4 Training Fundamentals
This chapter covers
- Forward feeding and backward propagation
- Splitting datasets and data pre-processing
- Using validation data to monitor overfitting
- Using checkpointing and early stopping for more economical training
- Hyperparameters vs model parameters
- Training for invariance to location and scale
- Assembling and accessing on-disk datasets
- Saving and then restoring a trained model
In this chapter, we cover the fundamentals of training a model. Prior to 2019, the majority of models were trained according to the same set of fundamental steps, which we walk through here. Consider this chapter a foundation.
The methods, techniques, and best practices covered here were developed over time through experimentation and trial and error. We start by reviewing forward feeding and backward propagation. While the concept and practice predate deep learning, it took numerous refinements over the years to make model training practical: specifically, in how we split the data, how we feed it to the model, and how we update the weights using gradient descent during backward propagation. These refinements provided the means to train models to convergence, the point where the model's prediction accuracy plateaus. Other techniques in data preprocessing and augmentation were developed to push convergence to a higher plateau and to help models generalize better to data they were not trained on.
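To make these steps concrete before we dig into the details, here is a minimal sketch, assuming TF.Keras and a synthetic dataset (the model, layer sizes, and data here are illustrative placeholders, not the chapter's examples). It splits the data into training and validation portions, feeds batches forward through a small model, and updates the weights with stochastic gradient descent during backward propagation.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic dataset for illustration: 1,000 samples, 20 features, 10 classes
x = np.random.random((1000, 20)).astype("float32")
y = np.random.randint(0, 10, size=(1000,))

# A small fully connected model
model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Stochastic gradient descent updates the weights during backward propagation
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# fit() performs the forward feed and backward propagation for each batch;
# validation_split holds out 20% of the data to monitor overfitting
model.fit(x, y, batch_size=32, epochs=10, validation_split=0.2)
```

Each of the pieces in this sketch, splitting the data, feeding it in batches, choosing the optimizer, and monitoring the validation metrics, is covered in depth in the sections that follow.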