5 RAG fundamentals with Chroma DB
This chapter covers
- Implementing semantic search using the RAG architecture
- Understanding vector stores and their functionality
- Implementing RAG with Chroma DB and OpenAI
In this chapter, you’ll dive into two essential concepts: semantic search and Retrieval-Augmented Generation (RAG). You’ll explore how large language models (LLMs) power semantic search through a chatbot, enabling you to query a system for information across multiple documents and retrieve the fragments that best match the meaning of your question rather than just its keywords. This approach is also known as Q&A over documents or querying a knowledge base.
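To make the idea concrete before the full walkthrough later in the chapter, here is a minimal sketch of semantic search with Chroma DB. The collection name, documents, and query are illustrative placeholders; by default, Chroma embeds text with a built-in sentence-transformer model, so the query matches on meaning rather than shared keywords.

```python
import chromadb

# In-memory Chroma client; documents are embedded with Chroma's default model
client = chromadb.Client()
collection = client.create_collection(name="knowledge_base")  # illustrative name

# Index a few document fragments (made-up examples)
collection.add(
    ids=["doc-1", "doc-2"],
    documents=[
        "Our refund policy allows returns within 30 days of purchase.",
        "Support is available by email from 9 a.m. to 5 p.m. on weekdays.",
    ],
)

# A semantic query: it shares no keywords with doc-1, but the meaning matches
results = collection.query(query_texts=["Can I get my money back?"], n_results=1)
print(results["documents"][0][0])  # -> the refund-policy fragment
```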
Earlier chapters introduced you to classic NLP solutions built on an LLM engine, especially summarization. Now, you’ll learn the basics of building a Q&A chatbot that searches across multiple documents, interacting with the LLM until you find satisfactory results.
This chapter focuses on RAG, the design pattern that powers semantic search systems, with particular emphasis on the vector store, a key component of these systems. You’ll learn the technical terminology of Q&A and RAG systems and see why terms like "semantic search" and "Q&A" are often used interchangeably.
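As a preview of the pattern, the sketch below wires a retrieval step into generation with OpenAI: retrieve the best-matching fragments from the vector store, then ask the LLM to answer using only that context. The model name and prompt wording are assumptions for illustration; the chapter develops the full pipeline step by step.

```python
import chromadb
from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

# Retrieval side: a small vector store (same setup as the previous sketch)
chroma = chromadb.Client()
collection = chroma.create_collection(name="knowledge_base")
collection.add(
    ids=["doc-1"],
    documents=["Our refund policy allows returns within 30 days of purchase."],
)

question = "Can I get my money back?"

# 1. Retrieve: fragments whose meaning best matches the question
results = collection.query(query_texts=[question], n_results=1)
context = "\n".join(results["documents"][0])

# 2. Augment and generate: ground the LLM's answer in the retrieved context
llm = OpenAI()
response = llm.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": f"Answer using only the context below.\n\nContext:\n{context}"},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)
```

The key design choice is that the LLM never sees your whole document set: the vector store narrows the input to a few relevant fragments, which keeps the prompt small and the answer grounded in your own data.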