
3 Grounding Outputs with RAG


This chapter covers

  • Retrieval-augmented generation (RAG) and how it overcomes the limitations of standalone LLMs
  • Core components of a RAG architecture: retrievers, generators, and orchestrators
  • Indexing and structuring knowledge sources to enable retrieval of relevant passages
  • Building sample RAG systems with LangChain to simplify orchestration

In the previous chapter, a world of possibilities opened up for constructing conversational AI through prompting: carefully crafting the input text of large language models (LLMs) to shape helpful, eloquent chatbot responses. However, despite its disruptive potential, prompting alone leaves major gaps in the flexibility needed for real-world assistance.

Figure 3.1 Basic prompting: interacting with an LLM directly [1]

With basic prompting (shown in figure 3.1), LLMs have no direct means of accessing live external data beyond their training corpora. Allowing chatbots to incorporate dynamic knowledge is crucial, however, as our retail e-commerce chatbot illustrates. Could prompting alone answer a shopper asking about current inventory? Figure 3.2 shows what happens when we try:

Figure 3.2 Asking an LLM (Claude) a specific question about inventory systems
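
To make the limitation concrete, the following is a minimal sketch of basic prompting: a single call to Claude through the anthropic Python SDK, roughly the interaction figure 3.2 depicts. The model name, the product, and the shopper's question are illustrative stand-ins, not details from a real catalog.

# A minimal sketch of basic prompting: one direct call to an LLM, with no
# retrieval step. Assumes the `anthropic` package is installed and the
# ANTHROPIC_API_KEY environment variable is set.
import anthropic

client = anthropic.Anthropic()

# A hypothetical shopper question about live inventory
question = "Is the TrailRunner 2 hiking shoe in stock in size 10?"

response = client.messages.create(
    model="claude-3-haiku-20240307",  # illustrative model choice
    max_tokens=300,
    messages=[{"role": "user", "content": question}],
)

# With nothing but its training corpus to draw on, the model cannot consult
# our inventory system; it can only guess or decline to answer.
print(response.content[0].text)

However the model responds, it cannot see our stock levels. The rest of this chapter builds the retrieval machinery that closes exactly this gap.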

3.1 What is retrieval-augmented generation?

3.2 RAG system architecture

3.2.1 The role of the retriever

3.2.2 The factual generator

3.2.3 RAG system flow

3.2.4 Example: E-commerce shopping assistant

3.3 Reducing hallucinations with RAG

3.3.1 Grounding the language model

3.3.2 Asking a question about climate change

3.3.3 Advantages of RAG in reducing hallucinations

3.3.4 Critical applications

3.3.5 Enhancing transparency

3.4 Data preparation and reliable indexing for RAG systems

3.4.1 Introduction to LangChain

3.4.2 Structuring product catalogs

3.4.3 Creating the searchable vector index

3.4.4 Implementing a FAISS index

3.5 Building an effective RAG system

3.5.1 Building an e-commerce chatbot powered by RAG

3.5.2 Processing user queries

3.5.3 Crafting an accurate response using an LLM

3.6 Evaluating and optimizing RAG systems

3.6.1 The role of evaluator LLMs in assessing hallucinations

3.6.2 Crafting evaluation datasets

3.6.3 Practical metrics for RAG evaluation