12 Overcoming ranking bias through active learning


This chapter covers

  • Harnessing live user interactions to gather feedback on a deployed LTR model
  • A/B testing search relevance solutions with live users
  • Using active learning to explore potentially relevant results beyond the top results
  • Balancing exploitation of what users already click on with exploration of what else might be relevant

So far, our learning to rank (LTR) work has taken place in the lab. In previous chapters, we built models using automatically constructed training data from user clicks. In this chapter, we’ll take our model into the real world for a test drive with (simulated) live users!

Recall that we compared an automated LTR system to a self-driving car. Internally, the car has an engine: the end-to-end model retraining loop over historical judgments discussed in chapter 10. In chapter 11, we compared our model’s training data to the car’s driving directions: what should we optimize so the system automatically learns judgments from previous interactions with search results? We built that training data and overcame key biases inherent in click data.
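Before diving in, here is the core idea behind the explore/exploit balance this chapter builds toward. The sketch below is a deliberately simplified epsilon-greedy stand-in, not the Gaussian process approach covered in section 12.3.2, and the names (`rank_with_exploration`, an `ltr_model` object with a `rank` method) are hypothetical placeholders rather than anything from the book's codebase:

```python
import random

def rank_with_exploration(query, ltr_model, candidates,
                          explore_rate=0.1, page_size=10):
    """Mostly exploit the current LTR model's ranking, but with a small
    probability swap in a lower-ranked candidate (explore), so future
    click data covers documents the model never surfaces on its own."""
    ranked = ltr_model.rank(query, candidates)  # exploit: best-known order
    page = list(ranked[:page_size])
    unseen = list(ranked[page_size:])
    for i in range(len(page)):
        if unseen and random.random() < explore_rate:
            # Show a document users would otherwise never see, gathering
            # feedback that counters presentation bias in the click logs.
            page[i] = unseen.pop(random.randrange(len(unseen)))
    return page
```

Sprinkling a little exploration into live rankings is what lets the feedback loop in section 12.4 keep learning about documents the current model underrates, instead of endlessly reinforcing whatever it already shows at the top.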

12.1 Our automated LTR engine in a few lines of code

12.1.1 Turning clicks into training data (chapter 11 in one line of code)

12.1.2 Model training and evaluation in a few function calls

12.2 A/B testing a new model

12.2.1 Taking a better model out for a test drive

12.2.2 Defining an A/B test in the context of automated LTR

12.2.3 Graduating the better model into an A/B test

12.2.4 When “good” models go bad: What we can learn from a failed A/B test

12.3 Overcoming presentation bias: Knowing when to explore vs. exploit

12.3.1 Presentation bias in the RetroTech training data

12.3.2 Beyond the ad hoc: Thoughtfully exploring with a Gaussian process

12.3.3 Examining the outcome of our explorations

12.4 Exploit, explore, gather, rinse, repeat: A robust automated LTR loop

Summary
