6 RAG fundamentals with Chroma DB


This chapter covers

  • Implementing semantic search using the RAG architecture
  • Understanding what vector stores are and how they work
  • Implementing RAG with Chroma DB and OpenAI

In this chapter, you'll dive into two essential concepts: semantic search and Retrieval-Augmented Generation (RAG). You'll explore how large language models (LLMs) power semantic search through a chatbot: you ask a question against a collection of documents, and the system retrieves the fragments that best match the meaning of your question rather than just its keywords. This approach is also known as Q&A over documents or querying a knowledge base.
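To make the contrast with keyword search concrete, here is a minimal sketch of semantic search with embeddings. It assumes the OpenAI Python SDK and an OPENAI_API_KEY environment variable; the sample documents, query, and model choice are illustrative rather than the chapter's own code.

import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "The annual report shows revenue grew 12% year over year.",
    "To reset your password, open Settings and choose Security.",
]

def embed(texts):
    # One API call embeds the whole batch; each text becomes a dense vector.
    response = client.embeddings.create(
        model="text-embedding-3-small",  # illustrative model choice
        input=texts,
    )
    return np.array([item.embedding for item in response.data])

doc_vectors = embed(documents)
query_vector = embed(["How do I get my money back?"])[0]

# Rank by cosine similarity: closeness in meaning, not shared keywords.
scores = doc_vectors @ query_vector / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vector)
)
print(documents[int(np.argmax(scores))])

The refund sentence wins even though it shares no keywords with the query; the ranking comes entirely from the meaning the embedding vectors capture.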

In earlier chapters, you learned about summarization, a typical use case for large language models. Now, I'll walk you through the basics of building a Q&A chatbot that searches across multiple documents. You'll interact with the LLM to find the answers you're looking for.
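As a first taste, a Q&A chatbot over a single document can be as simple as pasting the document into the prompt. Here is a minimal sketch, again assuming the OpenAI Python SDK; the file name and model are placeholders.

from openai import OpenAI

client = OpenAI()
document = open("policy.txt").read()  # hypothetical file: any short text

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": "Answer strictly from the document below.\n\n" + document},
        {"role": "user",
         "content": "How long do customers have to return an item?"},
    ],
)
print(response.choices[0].message.content)

This works as long as the document fits in the model's context window. Once you have a whole knowledge base, you need retrieval to pick out the right fragments first, and that is exactly what RAG adds.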

This chapter focuses on RAG, the design pattern that powers semantic search systems, with a particular emphasis on the vector store, a key component of these systems. You'll pick up the technical terminology around Q&A and RAG systems and see why terms like "semantic search" and "Q&A over documents" are often used interchangeably. The sketch below condenses the full loop you'll build over the course of the chapter.
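The loop has three steps: store text in a vector store, retrieve the fragments closest in meaning to a question, and hand them to the LLM as context. A compressed sketch, assuming the chromadb and openai packages (Chroma's built-in default embedding model handles the vectors here); the collection name, sample documents, and model are illustrative.

import chromadb
from openai import OpenAI

chroma = chromadb.Client()  # in-memory vector store
collection = chroma.create_collection("knowledge_base")
collection.add(
    ids=["doc1", "doc2"],
    documents=[
        "Chroma stores embeddings and supports similarity search.",
        "RAG injects retrieved text into the prompt before generation.",
    ],
)

question = "How does RAG use retrieved documents?"
results = collection.query(query_texts=[question], n_results=1)
context = results["documents"][0][0]  # best-matching fragment

llm = OpenAI()
response = llm.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": "Answer using only this context:\n" + context},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)

Everything the rest of the chapter does is an elaboration of these three steps: embed and store, retrieve by similarity, and generate with the retrieved context in the prompt.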

6.1 Semantic Search

6.1.1 A Basic Q&A Chatbot Over a Single Document

6.1.2 A More Complex Q&A Chatbot Over a Knowledge Base

6.1.3 The RAG Design Pattern

6.2 Vector Stores

6.2.1 What’s a Vector Store?

6.2.2 How Do Vector Stores Work?

6.2.3 Vector Libraries vs. Vector Databases

6.2.4 Most Popular Vector Stores

6.2.5 Storing Text and Performing a Semantic Search Using Chroma

6.3 Implementing RAG from Scratch

6.3.1 Retrieving Content from the Vector Database

6.3.2 Invoking the LLM

6.3.3 Building the Chatbot

6.3.4 Recap on RAG Terminology

6.4 Summary