10 Learning to Rank

This chapter covers:

  • Using machine learning to build generalized relevance ranking
  • Ranking within the search engine using machine learning models
  • How learning to rank is different from other machine learning methods
  • Building a robust and generalizable ranking model

In this chapter, we’ll explore Learning to Rank (LTR): using machine learning to create a generalized ranking function. We’ll start by seeing how LTR compares to the solutions from previous chapters. We’ll then begin our exploration with simple models using Solr’s LTR capabilities, walking through the steps of training an LTR model and ranking search results with it. Finally, we’ll close with a discussion of the different choices and options along the path to performing LTR.

10.1  The Limits of Collaborative Filtering Ranking

We saw in chapter 4 that collaborative filtering can predict which documents are likely to satisfy a specific query, based on similar queries. Consider the two queries red shoe and scarlett shoes in Table 10.1:

Table 10.1. Comparing the success of different products between the query red shoe and the query scarlett shoes
Product | q=Red Shoe | q=Scarlett Shoes
(product image) | CTR=0.9 | CTR=0.9
(product image) | CTR=0.01 | CTR=0.01
(product image) | CTR=0.5 | (Not Returned)
(product image) | CTR=0.01 | CTR=0.01
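One way to read Table 10.1 is as a lookup keyed by query and product. The following is a minimal sketch rather than code from this chapter: it assumes the CTR values pair up row by row, and the product ids `p1` through `p4` and both helper functions are hypothetical, since the table identifies products only by image. It shows that a ranking built purely from observed click-through rates has nothing to say about a product that was never returned for a query, or about a query with no click history at all.

```python
# Hypothetical product ids (p1..p4); the table identifies products by image only.
# CTR values are taken from Table 10.1, assuming they pair up row by row.
ctr_by_query = {
    "red shoe": {"p1": 0.9, "p2": 0.01, "p3": 0.5, "p4": 0.01},
    # p3 was "(Not Returned)" for this query, so there is no CTR for it.
    "scarlett shoes": {"p1": 0.9, "p2": 0.01, "p4": 0.01},
}


def rank_by_ctr(query):
    """Order products by observed CTR for this exact query, best first."""
    observed = ctr_by_query.get(query, {})
    return sorted(observed, key=observed.get, reverse=True)


def products_without_signal(query, reference_query):
    """Products with click data under reference_query but none under query."""
    return sorted(set(ctr_by_query[reference_query]) - set(ctr_by_query[query]))


red_ranking = rank_by_ctr("red shoe")
scarlett_ranking = rank_by_ctr("scarlett shoes")
unseen = products_without_signal("scarlett shoes", "red shoe")
no_history = rank_by_ctr("crimson shoes")  # a query with no click data at all
```

Here `p3` performs reasonably well for red shoe but cannot be ranked at all for scarlett shoes, and a query with no history produces an empty ranking.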

10.2  Learning to Rank: Generalized Optimization of Relevance

10.2.1  Practical Learning to Rank

10.3  Step 1: A Judgment List, Starting with Ground Truth

10.4  Step 2: Feature Logging

10.4.1  Storing Features

10.4.2  Logging Features from our Corpus

10.5  Step 3: Transforming LTR to a Traditional Machine Learning Problem

10.5.1  SVMRank: Transforming Ranking to Binary Classification

10.5.2  Transforming our LTR Training Data to Binary Classification

10.6  Step 4: Training (and Testing!) the Model

10.6.3  Validating the Model