10 Learning to rank for generalizable search relevance

This chapter covers:

Using machine learning to build generalizable search systems
Ranking within the search engine using machine learning models
How learning to rank is different from other machine learning methods
Building a robust and generalizable ranking model

It’s a random Tuesday. You review your search logs. The searches range from the frustrated runner’s “polar m430 running watch charger” to the worried hypochondriac’s “weird bump on nose - cancer?” to the curious cinephile’s “william shatner first film”. Although many of these are one-off queries, you know each user expects nothing less than amazing search results.

You feel hopeless. You know many query strings, by themselves, are distressingly rare. You have very little click data to know what’s relevant for these searches. Every day gets more challenging: trends, use cases, products, user interfaces, and even languages evolve. How can anyone hope to build search that amazes when users seem to constantly surprise us with new ways of searching?

10.1 The limits of collaborative filtering ranking

10.2 Learning to rank: generalized optimization of relevance

10.2.1 Implementing learning to rank in the real world

10.3 Step 1 - a judgment list, starting with ground truth

10.4 Step 2 - feature logging and engineering

10.4.1 Storing features in a modern search engine

10.4.2 Logging features from our Solr corpus

10.5 Step 3 - transforming LTR to a traditional machine learning problem

10.5.1 SVMRank: Transforming ranking to binary classification

10.5.2 Transforming our LTR training data to binary classification

10.6 Step 4 - training (and testing!) the model

10.6.1 Turning a separating hyperplane’s vector into a scoring function

10.6.2 Taking the model for a test drive

10.6.3 Validating the model

10.7 Steps 5 and 6 - upload a model and search

10.7.1 A note on LTR performance

10.8 Rinse and repeat

10.9 Summary