13 Semantic search with dense vectors

 

This chapter covers:

  • Representing the meaning of text with dense vectors
  • An introduction to Transformers and their impact on text representation and retrieval
  • Building a fast and accurate autocomplete using transformer models
  • Using approximate nearest neighbor (ANN) search to speed up dense vector retrieval
  • Semantic search using dense vectors

In this chapter, we’ll start our journey into the emerging future of search, where we see a swelling wave of hypercontextual vectors soak into the beaches of information retrieval.

Our story begins with what you already learned in section 2.5: we can represent context as numerical vectors, and we can compare these vectors with a similarity metric to see which are closer. In chapter 2 we demonstrated the concept of searching on dense vectors, a technique known as "dense vector search", but our examples were simple and contrived (searching on made-up food attributes). In this chapter we pose the question: how can we convert real-world unstructured text into a high-dimensional dense vector space that attempts to model the actual meaning of the text? And how can we leverage this representation of knowledge for advanced search applications?
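As a quick refresher on comparing vectors with a similarity metric, here is a minimal sketch using cosine similarity. The food names, the three attribute dimensions, and every value in the vectors are hypothetical and invented for illustration; they are not the examples from chapter 2, and cosine similarity is only one of several metrics that could be used.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical hand-made dense vectors: [sweetness, saltiness, is_a_drink].
# These attributes and values are assumptions made up for this sketch.
foods = {
    "apple juice": [0.9, 0.0, 1.0],
    "donut": [0.9, 0.1, 0.0],
    "pretzel": [0.1, 0.9, 0.0],
}

def rank(query_vector, items):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vector, vector))
              for name, vector in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# A hypothetical query vector meaning roughly "something sweet to drink".
query = [0.8, 0.0, 0.9]
ranking = rank(query, foods)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these made-up values, the sweet drink ranks first and the salty snack ranks last. The open question for the rest of the chapter is how to obtain such vectors from real text instead of writing them by hand.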

13.1 Language Translation as an Analogy for Text Representation

 
 

13.1.1 Representation of Meaning through Text Embeddings

 
 
 

13.2 Search using Dense Vectors

 
 

13.2.1 A brief refresher on sparse vectors

 
 
 

13.2.2 A conceptual dense vector search engine

 

13.3 Getting Text Embeddings by using a Transformer Encoder

 
 
 

13.3.1 What is a Transformer?

 
 

13.3.2 Openly available pre-trained transformer models

 
 
 

13.4 Applying Transformers to Search

 
 
 

13.4.1 Using the Outdoors StackExchange dataset

 
 

13.4.2 Fine-tuning and the Semantic Text Similarity Benchmark (STS-B)

 
 
 

13.4.3 Introducing SBERT, a transformer library built around similarity between sentences

 

13.5 Natural Language Autocomplete

 
 

13.5.1 Getting noun phrases and verb phrases for our nearest-neighbor vocabulary

 
 

13.5.2 Getting embeddings

 
 
 

13.5.3 Approximate Nearest-Neighbor search

 

13.5.4 Approximate Nearest-Neighbor index implementation

 
 
 