In the previous chapter, we learned how to use A/B testing to evaluate changes to the system your engineering team is building. Once the tooling is in place to run A/B tests, the team should see a steady increase in the quality of the system as new changes follow the engineering workflow: implement a change candidate, evaluate it offline, and evaluate it online with an A/B test.
As the use of A/B testing increases, you’ll spend more time evaluating “B” versions that underperform the current system, “A.” (Recall from chapter 1 that most Bs underperform As.) Every time an underperforming B is measured, you pay a cost equal to the value of the business metric you would have achieved with A minus the value actually achieved with B. If, for example, the business metric is revenue, then the cost of measuring B is “revenue of A - revenue of B.” It makes sense to look for ways to reduce this cost. You can’t simply stop an A/B test early when it looks good: although a shorter test would cost less, stopping early produces false positives (see chapter 8, section 8.2).
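The per-measurement cost described above can be sketched in a few lines of code. The metric values and user count here are hypothetical, chosen only to illustrate the arithmetic:

```python
def measurement_cost(metric_a, metric_b):
    """Cost of showing B to a user instead of A: metric(A) - metric(B).

    Positive when B underperforms A; negative (a gain) when B wins.
    """
    return metric_a - metric_b


# Hypothetical per-user revenue for each version.
revenue_a = 1.25  # average revenue per user on A
revenue_b = 1.10  # average revenue per user on B
n_users_on_b = 10_000  # users assigned to B during the test

# Total revenue forgone while the test runs: (1.25 - 1.10) * 10,000,
# i.e., roughly 1,500 units of revenue.
total_cost = measurement_cost(revenue_a, revenue_b) * n_users_on_b
```

Summed over many tests, these per-user costs are what motivates the search for cheaper evaluation methods.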