
4 Refining the best result with improvement-based policies

 

This chapter covers

  • The Bayesian optimization loop
  • The tradeoff between exploitation and exploration in a Bayesian optimization policy
  • Improvement as a criterion for finding new data points
  • Bayesian optimization policies that leverage improvement

In this chapter, we first remind ourselves of the iterative nature of BayesOpt: we alternate between training a Gaussian process on the collected data and finding the next data point to label using a BayesOpt policy. This forms a virtuous cycle in which past data inform future decisions. We then discuss what we look for in a BayesOpt policy, the decision-making algorithm that selects which data point to label next: it needs to balance sufficiently exploring the search space with zeroing in on high-performing regions. Finally, we use the notion of improvement over the best result seen so far to build concrete policies, including Probability of Improvement and a policy that optimizes the expected value of improvement.
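To make the loop concrete, here is a minimal sketch of that cycle in Python. It is not the book's own code: it stands in scikit-learn's GaussianProcessRegressor for the Gaussian process and uses the Probability of Improvement criterion (covered in section 4.2) as the policy; the function names bayesopt_loop and probability_of_improvement are illustrative choices, not an established API.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def probability_of_improvement(model, candidates, best_so_far):
    """Probability that a candidate improves on the best observed value, under the GP."""
    mean, std = model.predict(candidates, return_std=True)
    return norm.cdf((mean - best_so_far) / (std + 1e-9))

def bayesopt_loop(objective, candidates, n_initial=3, n_iterations=10, seed=0):
    """A generic BayesOpt loop over a finite, one-dimensional candidate set."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(candidates), size=n_initial, replace=False)
    X = candidates[idx].reshape(-1, 1).tolist()   # initial randomly chosen observations
    y = [objective(x[0]) for x in X]

    for _ in range(n_iterations):
        # Step 1: train a Gaussian process on the data collected so far.
        model = GaussianProcessRegressor().fit(X, y)
        # Step 2: score every candidate with the policy and pick the maximizer.
        scores = probability_of_improvement(model, candidates.reshape(-1, 1), max(y))
        next_x = candidates[int(np.argmax(scores))]
        # Step 3: label the chosen point, add it to the data, and repeat.
        X.append([next_x])
        y.append(objective(next_x))

    best = int(np.argmax(y))
    return X[best][0], y[best]

# Usage: maximize a toy objective on a dense grid of candidate points.
grid = np.linspace(0, 10, 200)
x_best, y_best = bayesopt_loop(lambda x: -(x - 3.0) ** 2, grid)
```

The key structure to notice is the alternation inside the for loop: the model is refit to all data collected so far, and the policy's scores, computed from that model, decide which point is labeled next.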

4.1.1 The Bayesian optimization loop and policies

4.1.2 Balancing exploration and exploitation

4.2 Finding improvement in Bayesian optimization

4.2.1 Measuring improvement with a Gaussian process

4.2.2 Computing the probability of improvement

4.2.3 Diagnosing the Probability of Improvement policy

4.2.4 Exercise 1: Encouraging exploration with Probability of Improvement

4.3 Optimizing the expected value of improvement

4.4 Summary

4.5 Exercise 2: Bayesian optimization for hyperparameter tuning