chapter eleven

11 Automating learning to rank with click models

This chapter covers

Automating learning to rank retraining from user behavioral signals (searches, clicks, etc.)
Transforming user signals into implicit LTR training data using click models
Overcoming users’ tendency to click items higher in the search results, regardless of relevance
Handling low-confidence documents with fewer clicks when deriving implicit judgments

In chapter 10, we went step by step through training a learning to rank (LTR) model. Much like walking through the mechanics of building a car, that process showed us the underlying nuts and bolts of LTR model training. In this chapter, we’ll treat the LTR training process as a black box. In other words, we’ll step away from the LTR internals and instead treat LTR more like a self-driving car, fine-tuning its trip toward a final destination.

Recall that LTR relies on accurate training data in order to be effective. LTR training data describes how users expect search results to be optimally ranked; it provides the directions we’ll input into our LTR self-driving car. As you’ll see, determining what’s relevant based on user interactions comes with many challenges. If we can overcome these challenges and gain high confidence in our training data, though, we can build automated learning to rank: a system that regularly retrains LTR to capture the latest user relevance expectations.

11.1 (Re)creating judgment lists from signals

11.1.1 Generating implicit, probabilistic judgments from signals

11.1.2 Training an LTR model using probabilistic judgments

11.1.3 Click-through rate: Your first click model

11.1.4 Common biases in judgments

11.2 Overcoming position bias

11.2.1 Defining position bias

11.2.2 Position bias in RetroTech data

11.2.3 Simplified dynamic Bayesian network: A click model that overcomes position bias

11.3 Handling confidence bias: Not upending your model due to a few lucky clicks

11.3.1 The low-confidence problem in click data

11.3.2 Using a beta prior to model confidence probabilistically

11.4 Exploring your training data in an LTR system

Summary