4 Refining the best result with improvement-based policies
This chapter covers
- The Bayesian optimization loop
- The tradeoff between exploitation and exploration in a Bayesian optimization policy
- Improvement as a criterion for finding new data points
- Bayesian optimization policies that leverage improvement
In this chapter, we first remind ourselves of the iterative nature of Bayesian optimization (BayesOpt): we alternate between training a Gaussian process on the collected data and using a BayesOpt policy to find the next data point to label. This forms a virtuous cycle in which our past data inform our future decisions. We then discuss what we look for in a BayesOpt policy, the decision-making algorithm that decides which data point to label next. A good policy needs to balance sufficiently exploring the search space with zeroing in on high-performing regions.
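The iterative loop described above can be sketched in code. The following is a minimal, self-contained illustration, not the implementation developed in this book: it uses a bare-bones Gaussian process surrogate with a fixed RBF kernel, a toy one-dimensional objective, and a simple upper-confidence-bound score as a stand-in for the improvement-based policies this chapter builds up to. The objective function, kernel length scale, and exploration weight are all made-up values for illustration.

```python
import math

def rbf(a, b, length=0.5):
    # Squared-exponential (RBF) kernel with a fixed, made-up length scale.
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))

def solve(matrix, rhs):
    # Gaussian elimination with partial pivoting for the small system K w = rhs.
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (aug[r][n] - sum(aug[r][c] * w[c]
                                for c in range(r + 1, n))) / aug[r][r]
    return w

def gp_posterior(xs, ys, x, noise=1e-6):
    # Posterior mean and variance of a zero-mean GP at a test location x.
    K = [[rbf(xi, xj) + (noise if i == j else 0.0) for j, xj in enumerate(xs)]
         for i, xi in enumerate(xs)]
    k = [rbf(xi, x) for xi in xs]
    mean = sum(ki * wi for ki, wi in zip(k, solve(K, ys)))
    var = rbf(x, x) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, max(var, 0.0)

def objective(x):
    # Toy black-box function standing in for an expensive experiment;
    # its maximum is at x = 0.6.
    return -(x - 0.6) ** 2

# The BayesOpt loop: train the surrogate on the data collected so far,
# score candidate locations with the policy, label the winner, and repeat.
xs = [0.0, 1.0]
ys = [objective(x) for x in xs]
candidates = [i / 50 for i in range(51)]
for _ in range(8):
    def score(x):
        mean, var = gp_posterior(xs, ys, x)
        return mean + 2.0 * math.sqrt(var)  # favor high mean and high uncertainty
    x_next = max(candidates, key=score)
    xs.append(x_next)
    ys.append(objective(x_next))  # "label" the chosen point

best_y, best_x = max(zip(ys, xs))
```

Each pass through the loop refits the surrogate to everything observed so far, which is exactly the virtuous cycle described above: earlier labels reshape the posterior, and the reshaped posterior steers where the policy looks next.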