chapter six

6 Retrieval-Augmented Generation

This chapter covers

Basics of Retrieval-Augmented Generation (RAG): what it is, why it is used, and what benefits it provides.
How prompt techniques, elements, and patterns converge with RAG.
The architecture and workflow of RAG and its components.
Real-world examples that apply the RAG workflow in its different phases.

Chapter 5 discussed automatic and tree-of-thought prompting techniques, which build on chain-of-thought prompting while mitigating its limitations. We saw how graph-of-thought prompting can help solve problems that require interconnected reasoning, and how generated knowledge prompting produces knowledge for smaller sub-problems on the way to solving the main problem. Lastly, we looked at Automatic Prompt Engineering, which uses models to generate prompts, revise them based on feedback, and apply them at scale.

In this chapter, we will examine the role Retrieval-Augmented Generation (RAG) plays in interacting with models. We will discuss RAG's overall architecture, the role of prompt engineering, the benefits and uses of retrieval methodologies within RAG, the use cases where RAG is helpful, and the advantages and disadvantages of building a RAG-based system. To start, let's understand what RAG is.

6.1 What is Retrieval-Augmented Generation?

6.2 How does a RAG system work?

6.2.1 Pre-Retrieval

6.2.2 Retrieval

6.2.3 Post-Retrieval

6.3 Prompt Engineering & RAG

6.3.1 Improving Query Formulation During Retrieval

6.3.2 Standardizing Information Presentation

6.3.3 Optimizing Multi-Step Reasoning in RAG

6.4 Architectural Overview

6.4.1 Retriever

6.4.2 Generator

6.4.3 Augmentation

6.5 Stages of a RAG Pipeline

6.5.1 Pre-Retrieval

6.5.2 Retrieval

6.5.3 Post-Retrieval

6.6 Summary