13 Semantic search with dense vectors
This chapter covers
- Semantic search using embeddings from LLMs
- An introduction to Transformers and their effect on text representation and retrieval
- Building autocomplete using Transformer models
- Using approximate nearest neighbor (ANN) search and vector quantization to speed up dense vector retrieval
- Semantic search with bi-encoders and cross-encoders
In this chapter, we’ll start our journey into dense vector search, where the hyper-contextual vectors generated by large language models (LLMs) drive significant improvements in the interpretation of queries, documents, and search results. Generative LLMs (such as ChatGPT from OpenAI, along with many commercial and open-source alternatives) can also use these vectors to generate new content, including query expansions, search training data, and summaries of search results, as we’ll explore in the coming chapters.
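To make the core idea concrete before diving in: in dense vector search, a query and each document are represented as vectors of floating-point numbers, and documents are ranked by how similar their vectors are to the query's vector. The following is a minimal sketch of that ranking step. The document texts and the tiny four-dimensional "embeddings" here are made-up placeholders; in practice, the vectors would come from an embedding model and have hundreds of dimensions.

```python
import numpy as np

# Toy 4-dimensional "embeddings" standing in for real LLM-generated vectors.
# These documents and vectors are illustrative placeholders only.
documents = {
    "a guide to caring for pet dogs": np.array([0.9, 0.1, 0.0, 0.1]),
    "training your new puppy":        np.array([0.8, 0.2, 0.1, 0.0]),
    "stock market index funds":       np.array([0.0, 0.1, 0.9, 0.3]),
}

# Placeholder embedding for a query like "dog care tips"
query_embedding = np.array([0.85, 0.15, 0.05, 0.05])

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank documents by similarity of their dense vectors to the query vector
ranked = sorted(documents.items(),
                key=lambda item: cosine_similarity(query_embedding, item[1]),
                reverse=True)

for text, vector in ranked:
    print(f"{cosine_similarity(query_embedding, vector):.3f}  {text}")
```

Because the dog-related document vectors point in nearly the same direction as the query vector, they rank above the unrelated finance document, even though no keywords are compared. Scaling this brute-force comparison to millions of documents is exactly what the ANN search and quantization techniques later in this chapter address.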