This chapter covers
- Semantic search using Large Language Model (LLM) embeddings
- Representing the meaning of text with dense vectors
- An introduction to Transformers and their impact on text representation and retrieval
- Building a fast and accurate autocomplete using Transformer models
- Using approximate nearest neighbor (ANN) search to speed up dense vector retrieval
In this chapter, we’ll begin our journey into the emerging future of search, where the hyper-contextual vectors generated by Large Language Models (LLMs) are driving significant improvements in the interpretation of queries, documents, and search results. Generative LLMs (such as ChatGPT by OpenAI and many other commercial and open-source alternatives) can also use these vectors to generate new content, including query expansions, search training data, and summaries of search results, as we’ll explore in the coming chapters.
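To make the idea of "representing meaning with dense vectors" concrete before we dig in, here's a minimal sketch of semantic ranking. The 4-dimensional vectors below are illustrative stand-ins (real LLM embeddings typically have hundreds or thousands of dimensions, produced by an embedding model rather than written by hand), but the core operation, ranking documents by cosine similarity to a query vector, is the same:

```python
import math

# Toy 4-dimensional vectors standing in for real LLM embeddings.
# In practice these would come from an embedding model, not be hand-written.
embeddings = {
    "how do I reset my password": [0.9, 0.1, 0.2, 0.0],
    "forgot login credentials":   [0.8, 0.2, 0.3, 0.1],
    "best hiking trails nearby":  [0.1, 0.9, 0.0, 0.4],
}
# Hypothetical embedding of the query "password recovery"
query_vector = [0.85, 0.15, 0.25, 0.05]

def cosine_similarity(a, b):
    """Angle-based similarity between two dense vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Rank documents by semantic similarity to the query
ranked = sorted(embeddings.items(),
                key=lambda kv: cosine_similarity(query_vector, kv[1]),
                reverse=True)
for text, _ in ranked:
    print(text)
```

Even though the query "password recovery" shares no keywords with "forgot login credentials", their vectors point in similar directions, so both password-related documents rank above the unrelated one. This is the intuition behind semantic search that the rest of the chapter builds on.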