10 Hyperparameter Tuning
This chapter covers
- Initializing the weights in a model prior to training with warmup training
- Doing hyperparameter search manually and automatically
- Constructing a learning rate scheduler for training a model
- Regularizing a model during training
Hyperparameter tuning is the process of finding the optimal settings of the training hyperparameters, so that we minimize the training time and maximize the test accuracy.
Usually these two objectives can’t both be fully optimized at the same time. That is, if we minimize the training time, we likely will not achieve the best accuracy. Likewise, if we maximize the test accuracy, we likely will need longer to train.
Tuning is finding the combination of hyperparameter settings that meets your targets for these objectives. For example, if your target is the highest possible accuracy, you may not concern yourself with minimizing the training time. In another situation, if you only need good (but not the best) accuracy, and you are continuously retraining, you may want to find settings that reach this good accuracy while minimizing the training time.
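The two situations described above can be sketched as two selection rules over a set of completed training trials. This is a minimal illustration, not a prescribed method: the `Trial` record, the function names, and every number in the `trials` list are hypothetical values made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Trial:
    """One finished training run (hypothetical record for illustration)."""
    learning_rate: float
    batch_size: int
    train_minutes: float
    test_accuracy: float


def pick_best_accuracy(trials: List[Trial]) -> Trial:
    # Target: highest possible accuracy, ignoring training time.
    return max(trials, key=lambda t: t.test_accuracy)


def pick_fastest_meeting_target(
    trials: List[Trial], target_accuracy: float
) -> Optional[Trial]:
    # Target: good (but not necessarily best) accuracy in the least time.
    good_enough = [t for t in trials if t.test_accuracy >= target_accuracy]
    if not good_enough:
        return None  # no trial met the accuracy target
    return min(good_enough, key=lambda t: t.train_minutes)


# Made-up results, only to show the trade-off between the two objectives.
trials = [
    Trial(learning_rate=0.1, batch_size=256, train_minutes=12.0, test_accuracy=0.89),
    Trial(learning_rate=0.01, batch_size=64, train_minutes=45.0, test_accuracy=0.93),
    Trial(learning_rate=0.001, batch_size=32, train_minutes=120.0, test_accuracy=0.95),
]

best = pick_best_accuracy(trials)
fast = pick_fastest_meeting_target(trials, target_accuracy=0.90)
print("Highest accuracy:", best)
print("Fastest with accuracy >= 0.90:", fast)
```

With these made-up numbers, the first rule selects the slowest trial, while the second selects a trial that trains in less time at a slightly lower accuracy, which is the trade-off a continuously retrained model might prefer.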