3 Multi-armed bandits: Maximizing business metrics while experimenting

 

This chapter covers

  • Defining the multi-armed bandit (MAB) problem
  • Modifying A/B testing’s randomization procedure
  • Extending epsilon-greedy to simultaneously evaluate multiple system changes
  • Evaluating system changes even more quickly with Thompson sampling

In the previous chapter, we learned how to use A/B testing to evaluate changes to the system your engineering team is building. Once the tooling is in place to run A/B tests, the team should see a steady increase in the quality of the system as new changes follow the engineering workflow: implement a change candidate, evaluate it offline, and evaluate it online with an A/B test.

As the use of A/B testing increases, you’ll spend more time evaluating “B” versions that underperform the current system, “A.” (Recall from chapter 1 that most Bs underperform As.) Every time an underperforming B is measured, you pay a cost equal to the value of the business metric you would have achieved with A minus that achieved with B. If, for example, the business metric is revenue, then the cost of measuring B is “revenue of A - revenue of B.” It will make sense to look for ways to reduce this cost. You can’t just stop an A/B test early if it looks good because, although a shorter test would yield a lower cost, stopping early would produce false positives (see chapter 8, section 8.2).

3.1 Epsilon-greedy: Account for the impact of evaluation on business metrics

 
 

3.1.1 A/B testing as a baseline

 

3.1.2 The epsilon-greedy algorithm

 

3.1.3 Deciding when to stop

 
 
 

3.2 Evaluating multiple system changes simultaneously

 
 
 

3.3 Thompson sampling: A more efficient MAB algorithm

 

3.3.1 Estimate the probability that an arm is the best

 
 
 
 

3.3.2 Randomized probability matching

 
 
 

3.3.3 The complete algorithm

 
 
 

Summary

 
 
sitemap

Unable to load book!

The book could not be loaded.

(try again in a couple of minutes)

manning.com homepage
test yourself with a liveTest