8 Signals boosting models
This chapter covers
- Aggregating user signals to create a popularity-based ranking model
- Normalizing signals to best enhance relevance for noisy query input
- Fighting signal spam and user manipulation of crowdsourced signals
- Applying time decays to prioritize recent signals as more relevant
- Blending multiple signal types together into a unified signals boosting model
- Balancing flexibility and performance using query-time vs. index-time signals boosting
In Chapter 4, we covered three different categories of reflected intelligence: Signals Boosting (popularized relevance), Collaborative Filtering (personalized relevance), and Learning to Rank (generalized relevance). In this chapter, we’ll dive deeper into the first of these, implementing Signals Boosting to enhance the relevance ranking of your most popular queries and documents.
In most search engines, you will find that a relatively small number of queries tend to make up a large portion of your total query volume. These popular queries, called head queries, also tend to lead to more signals (such as clicks and purchases in an e-commerce use case), which enable stronger inferences about the popularity of top search results.
Signals boosting models directly harness these stronger inferences and are the key to ensuring your most important and highest-visibility queries are best tuned to return the most relevant documents.
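To make the idea concrete, here is a minimal sketch in plain Python of how raw signals might be rolled up into per-query popularity counts. The `signals` data, the document IDs, and the helper names `query_volume_share` and `aggregate_boosts` are all hypothetical and chosen only for illustration. A production system would typically run this aggregation inside the search engine or a data-processing framework and would add the normalization, spam handling, and time decay covered later in this chapter.

```python
from collections import Counter, defaultdict

# Hypothetical click signals as (query, clicked_document_id) pairs.
signals = [
    ("ipad", "doc1"),
    ("ipad", "doc1"),
    ("ipad", "doc2"),
    ("ipad", "doc1"),
    ("kindle", "doc3"),
    ("kindle", "doc3"),
    ("usb cable", "doc4"),
]


def query_volume_share(signals, top_n):
    """Return the fraction of all signals produced by the top_n most frequent queries."""
    counts = Counter(query for query, _ in signals)
    total = sum(counts.values())
    top = counts.most_common(top_n)
    return sum(count for _, count in top) / total


def aggregate_boosts(signals):
    """Count signals per (query, document) pair; the raw count serves as a naive boost."""
    boosts = defaultdict(Counter)
    for query, doc in signals:
        boosts[query][doc] += 1
    return boosts


boosts = aggregate_boosts(signals)

# Documents for the head query "ipad", ordered by raw signal count.
print(boosts["ipad"].most_common())

# Share of total signal volume contributed by the single most popular query.
print(query_volume_share(signals, top_n=1))
```

In this toy data set, one head query accounts for more than half of all signals, which is why its per-document counts give a much stronger basis for ranking than the single signal available for "usb cable".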