11 Contextualizing prompts with retrieval augmented generation

This chapter covers

  • Outlining how retrieval augmented generation (RAG) works
  • Using tooling to create a basic RAG setup
  • Integrating vector databases into a RAG setup

As we learned in the previous chapter, one of the challenges of working with LLMs is that they lack visibility into our context. In part 2 of this book, we looked at different ways to arrange our prompts to provide small insights into that context. However, these types of prompts only get us so far before the lack of extra context leads to less valuable responses. To increase the value of an LLM’s response, then, we need to provide more contextual detail in our prompt. In this chapter, we’ll explore how to do this through retrieval augmented generation, or RAG. We’ll learn how RAG works, why it’s of value to us, and how building a RAG setup isn’t a big jump from prompt engineering. We’ll then build our own RAG framework examples to establish our understanding of how they can help us in a testing context.
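Before we dig into the details, here is a minimal sketch of the RAG flow the chapter builds toward: retrieve material relevant to a question, then augment the prompt with it before sending it to an LLM. The documents, the word-overlap scoring, and the prompt template are illustrative assumptions, with the overlap scoring standing in for the vector similarity search we'll meet later in the chapter.

```python
def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by naive word overlap with the question.
    (A stand-in for the vector-database similarity search covered later.)"""
    q_words = set(question.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


def augment(question: str, context: list[str]) -> str:
    """Insert the retrieved context into the prompt we send to the LLM."""
    return (
        "Use the following context to answer the question.\n"
        f"Context: {' '.join(context)}\n"
        f"Question: {question}"
    )


# Hypothetical project documentation we want the LLM to draw on.
documents = [
    "Our checkout service requires two-factor authentication for admins.",
    "The reporting dashboard refreshes its data every 15 minutes.",
]

question = "How often does the reporting dashboard refresh?"
prompt = augment(question, retrieve(question, documents))
print(prompt)
```

Running this prints a prompt whose context section contains the dashboard document, because it shares the most words with the question; the real setups later in the chapter replace each piece with proper tooling.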

11.1 Extending prompts with RAG

11.2 Building a RAG setup

11.2.1 Building our RAG framework

11.2.2 Testing our RAG framework

11.3 Enhancing data storage for RAG

11.3.1 Working with vector databases

11.3.2 Setting up a vector-database-backed RAG

11.3.3 Testing a vector-database-backed RAG framework

11.3.4 Going forward with RAG frameworks

11.4 Summary