We saw in chapter 4 that, by aiming to improve on the best value achieved so far, we can design improvement-based BayesOpt policies such as Probability of Improvement (POI) and Expected Improvement (EI). In chapter 5, we used multi-armed bandit (MAB) policies to obtain Upper Confidence Bound (UCB) and Thompson sampling (TS), each of which uses a unique heuristic to balance exploration and exploitation in the search for the global optimum of the objective function.
In this chapter, we learn about another approach to decision-making, this time using information theory to design BayesOpt policies for our optimization pipeline. Unlike the heuristics we have seen so far (seeking improvement, optimism in the face of uncertainty, and random sampling), which might seem specific to optimization-related tasks, information theory is a major subfield of mathematics with applications across a wide range of topics. As we discuss in this chapter, by appealing to information theory, or more specifically to entropy, a quantity that measures uncertainty in terms of information, we can design BayesOpt policies that seek to reduce our uncertainty about the objective function in a principled and mathematically elegant manner.
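To make the notion of entropy concrete before we use it to design policies, the following minimal sketch computes the Shannon entropy H(p) = -Σ p log p of a few discrete distributions with NumPy. The distributions themselves are illustrative assumptions, not examples from the text; the point is simply that a uniform (maximally uncertain) distribution has the highest entropy, while a near-deterministic one has entropy close to zero.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy H(p) = -sum(p * log p), in nats.
    Outcomes with p = 0 contribute 0 by convention."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # 0 * log(0) is taken to be 0
    return -np.sum(p * np.log(p))

# Illustrative distributions over four outcomes (our own examples)
uniform = [0.25, 0.25, 0.25, 0.25]       # maximum uncertainty
skewed = [0.7, 0.1, 0.1, 0.1]            # moderately uncertain
near_certain = [0.97, 0.01, 0.01, 0.01]  # almost no uncertainty

for name, dist in [("uniform", uniform), ("skewed", skewed),
                   ("near-certain", near_certain)]:
    print(f"{name}: H = {shannon_entropy(dist):.3f} nats")
# uniform: H = 1.386 nats  (= log 4, the maximum for four outcomes)
# skewed: H = 0.940 nats
# near-certain: H = 0.168 nats
```

The entropy-based policies we develop in this chapter apply this same idea to our belief about the objective function: actions that most reduce this uncertainty are the most informative ones to take.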