5 Hyperparameter optimization service

This chapter covers

  • Hyperparameters and why they are important
  • Two common approaches to hyperparameter optimization (HPO)
  • Designing an HPO service
  • Three popular HPO libraries: Hyperopt, Optuna, and Ray Tune

In the previous two chapters, we saw how models are trained: a training service manages training processes in a remote compute cluster with given model algorithms. But model algorithms and training services aren't all there is to model training. There's one more component we haven't discussed yet: hyperparameter optimization (HPO). Data scientists often overlook the fact that hyperparameter choices can significantly influence model training results, and that selecting those values can be automated with engineering methods.

Hyperparameters are parameters whose values must be set before the model training process starts. Learning rate, batch size, and number of hidden layers are all examples of hyperparameters. Unlike the values of model parameters (weights and biases, for example), hyperparameters cannot be learned during the training process.
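
To make the distinction concrete, the following sketch sets hyperparameters up front, while the model's weights and biases are left for the optimizer to learn. (PyTorch is an assumption here for illustration; any framework shows the same split.)

import torch
import torch.nn as nn

# Hyperparameters: chosen before training starts
learning_rate = 0.01
batch_size = 32
hidden_size = 128   # number of units in the hidden layer

# Model parameters (weights and biases) live inside the layers;
# their values are learned during training rather than set by us.
model = nn.Sequential(
    nn.Linear(10, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, 1),
)
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)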

Research shows that the chosen hyperparameter values affect both the quality of model training and the time and memory requirements of the training algorithm. Therefore, hyperparameters must be tuned to values that are optimal for model training. Today, HPO has become a standard step in the deep learning model development process.
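
As a preview of what "tuning" means in practice, here is a minimal HPO sketch using Optuna (one of the libraries covered in section 5.4). The objective function below is a hypothetical stand-in for a real training run; in practice it would train a model with the suggested values and return a validation metric.

import optuna

def objective(trial):
    # Ask the optimizer for candidate hyperparameter values
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True)
    batch_size = trial.suggest_categorical("batch_size", [16, 32, 64])
    # Placeholder score standing in for a real validation loss
    return (lr - 0.01) ** 2 + batch_size * 1e-4

study = optuna.create_study(direction="minimize")  # minimize validation loss
study.optimize(objective, n_trials=20)             # run 20 trials
print(study.best_params)                           # best values found

Each trial proposes a new set of hyperparameter values, evaluates them, and feeds the result back so the search can focus on promising regions; the full workflow is examined in section 5.2.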

5.1 Understanding hyperparameters

5.1.1 What is a hyperparameter?

5.1.2 Why are hyperparameters important?

5.2 Understanding hyperparameter optimization

5.2.1 What is HPO?

5.2.2 Popular HPO algorithms

5.2.3 Common automatic HPO approaches

5.3 Designing an HPO service

5.3.1 HPO design principles

5.3.2 A general HPO service design

5.4 Open source HPO libraries

5.4.1 Hyperopt

5.4.2 Optuna