4 Refining the best result with improvement-based policies
This chapter covers
- The Bayesian optimization loop
- The tradeoff between exploitation and exploration in a Bayesian optimization policy
- Improvement as a criterion for finding new data points
- Bayesian optimization policies that leverage improvement
In this chapter, we first remind ourselves of the iterative nature of Bayesian optimization (BayesOpt): we alternate between training a Gaussian process on the collected data and using a BayesOpt policy to find the next data point to label. This forms a virtuous cycle in which our past data inform our future decisions. We then discuss what we look for in a BayesOpt policy, the decision-making algorithm that decides which data point to label next. A good policy needs to balance sufficiently exploring the search space with zeroing in on high-performing regions.
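The iterative loop described above can be sketched in code. The following is a minimal, self-contained illustration, not the implementation developed in this book: it uses a bare-bones Gaussian process surrogate with a fixed RBF kernel, a toy one-dimensional objective, and a simple upper-confidence-bound score as a stand-in for the improvement-based policies this chapter builds up to. The objective function, kernel length scale, and exploration weight are all made-up values for illustration.

```python
import math

def rbf(a, b, length=0.5):
    # Squared-exponential (RBF) kernel with a fixed, made-up length scale.
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))

def solve(matrix, rhs):
    # Gaussian elimination with partial pivoting for the small system K w = rhs.
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (aug[r][n] - sum(aug[r][c] * w[c]
                                for c in range(r + 1, n))) / aug[r][r]
    return w

def gp_posterior(xs, ys, x, noise=1e-6):
    # Posterior mean and variance of a zero-mean GP at a test location x.
    K = [[rbf(xi, xj) + (noise if i == j else 0.0) for j, xj in enumerate(xs)]
         for i, xi in enumerate(xs)]
    k = [rbf(xi, x) for xi in xs]
    mean = sum(ki * wi for ki, wi in zip(k, solve(K, ys)))
    var = rbf(x, x) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, max(var, 0.0)

def objective(x):
    # Toy black-box function standing in for an expensive experiment;
    # its maximum is at x = 0.6.
    return -(x - 0.6) ** 2

# The BayesOpt loop: train the surrogate on the data collected so far,
# score candidate locations with the policy, label the winner, and repeat.
xs = [0.0, 1.0]
ys = [objective(x) for x in xs]
candidates = [i / 50 for i in range(51)]
for _ in range(8):
    def score(x):
        mean, var = gp_posterior(xs, ys, x)
        return mean + 2.0 * math.sqrt(var)  # favor high mean and high uncertainty
    x_next = max(candidates, key=score)
    xs.append(x_next)
    ys.append(objective(x_next))  # "label" the chosen point

best_y, best_x = max(zip(ys, xs))
```

Each pass through the loop refits the surrogate to everything observed so far, which is exactly the virtuous cycle described above: earlier labels reshape the posterior, and the reshaped posterior steers where the policy looks next.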