Chapter 6. Searching with relevancy

 

This chapter covers

  • How scoring works inside Lucene and Elasticsearch
  • Boosting the score of a particular query or field
  • Understanding term frequency, inverse document frequency, and relevancy scores with the explain API
  • Reducing the impact of scoring by rescoring a subset of documents
  • Gaining ultimate power over scoring using the function_score query
  • The field data cache and how it affects Elasticsearch instances

In the world of free text, being able to match a document to a query is a feature touted by many different storage and search engines. What really makes an Elasticsearch query different from running SELECT * FROM users WHERE name LIKE 'bob%' is the ability to assign a relevancy, also known as a score, to a document. This score tells you how relevant the document is to the original query.
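To make the contrast concrete, here is a minimal Python sketch that sets a yes-or-no prefix match (in the spirit of the LIKE 'bob%' query) against a ranked search. The scoring function is a deliberately simple stand-in, the share of a document's tokens that equal the query term; it is not the formula Lucene or Elasticsearch uses. The sample documents and the function names like_prefix, toy_score, and ranked_search are hypothetical and exist only for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def like_prefix(docs, prefix):
    """Boolean match: a document either starts with the prefix or it does not."""
    return [doc_id for doc_id, text in docs.items()
            if text.lower().startswith(prefix)]


def toy_score(text, term):
    """Toy relevancy (an assumption, not Lucene's formula):
    the fraction of tokens in the document equal to the query term."""
    tokens = tokenize(text)
    return tokens.count(term) / len(tokens) if tokens else 0.0


def ranked_search(docs, term):
    """Return (doc_id, score) pairs for matching documents, best score first."""
    scored = [(doc_id, toy_score(text, term)) for doc_id, text in docs.items()]
    hits = [(doc_id, score) for doc_id, score in scored if score > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data.
docs = {
    1: "bob smith",
    2: "bob bob bob",
    3: "alice and bob went hiking together",
    4: "alice jones",
}

print(like_prefix(docs, "bob"))    # unordered yes/no matches
print(ranked_search(docs, "bob"))  # matches ordered by toy score
```

The prefix match can only say which documents qualify, and it misses document 3 entirely because "bob" is not at the start. The ranked search finds all three matching documents and orders them by how strongly each one matches.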

When users type a query into a search box on a website, they expect to find not only results matching their query but also those results ranked based on how closely they match the query’s criteria. As it turns out, Elasticsearch is quite flexible when it comes to determining the relevancy of a document, and there are a lot of ways to customize your searches to provide more relevant results.

6.1. How scoring works in Elasticsearch

6.2. Other scoring methods

6.3. Boosting

6.4. Understanding how a document was scored with explain

6.5. Reducing scoring impact with query rescoring

6.6. Custom scoring with function_score

6.7. Tying it back together

6.8. Sorting with scripts

6.9. Field data detour

6.10. Summary
