chapter five

5 Agentic RAG

This chapter covers

What agentic RAG is
Why we need agentic RAG
How to implement agentic RAG

In earlier chapters, we saw how to find relevant data using different methods of vector similarity search. Similarity search lets us find relevant data in unstructured data sources, but structured data can often bring more value than unstructured data because the structure itself carries information.

Adding structure to data can be an incremental process. We can start with a simple structure and then add more complex structures as we go. We saw this in the previous chapter, where we started with simple graph data and then added more complex structures to it.

An agentic RAG system (see figure 5.1) is a system in which a variety of retriever agents are available to retrieve the data needed to answer the user's question. The entry point to an agentic RAG system is usually a retriever router, whose job is to find the best-suited retriever (or retrievers) to perform the task at hand.
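To make the router idea concrete, here is a minimal sketch of the shape such a system could take. It is not the implementation used later in this chapter: the names RetrieverRouter, vector_search, graph_lookup, and keyword_choose are all hypothetical, the retrievers return placeholder strings instead of querying a database, and the selection step is a simple keyword rule standing in for the choice an LLM would normally make.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A retriever takes the user's question and returns text snippets.
Retriever = Callable[[str], List[str]]


@dataclass
class RetrieverRouter:
    """Hypothetical router: picks retrievers by name, then runs them."""
    retrievers: Dict[str, Retriever]
    # Selection step. Assumption: in a real system an LLM makes this choice;
    # here any function from (question, names) to chosen names will do.
    choose: Callable[[str, List[str]], List[str]]

    def route(self, question: str) -> Dict[str, List[str]]:
        names = self.choose(question, list(self.retrievers))
        # Ignore names the selector invented that are not registered.
        return {n: self.retrievers[n](question) for n in names if n in self.retrievers}


def vector_search(question: str) -> List[str]:
    # Placeholder for a vector similarity search over unstructured text.
    return [f"text chunk similar to: {question}"]


def graph_lookup(question: str) -> List[str]:
    # Placeholder for a structured query against graph data.
    return [f"graph facts for: {question}"]


def keyword_choose(question: str, names: List[str]) -> List[str]:
    # Toy stand-in for the LLM: counting questions go to the graph retriever.
    wanted = "graph_lookup" if "how many" in question.lower() else "vector_search"
    return [n for n in names if n == wanted]


router = RetrieverRouter(
    retrievers={"vector_search": vector_search, "graph_lookup": graph_lookup},
    choose=keyword_choose,
)
results = router.route("How many movies did Keanu Reeves act in?")
print(results)
```

Keeping the selection step as a swappable function means the keyword rule can later be replaced by an LLM call without changing how the retrievers are registered or invoked.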

One common way to implement an agentic RAG system is to use an LLM’s ability to use tools (sometimes called function calling). Not all LLMs have this ability, but OpenAI’s GPT-3.5 and GPT-4 do, and those are the models we will use in this chapter. For LLMs without native tool support, similar behavior can be achieved with the ReAct approach (see https://arxiv.org/abs/2210.03629), and the current trajectory suggests that tool use will eventually be available in all LLMs.
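As a rough illustration of what tool use involves, the sketch below describes one retriever as a tool and then executes a tool call. The schema follows the general shape of the tools parameter in OpenAI's Chat Completions API (a function name, a description, and JSON Schema parameters); check the current API reference before relying on it. The tool count_movies_by_actor, its sample data, and the helper run_tool_call are hypothetical, and the model's response is simulated so that no network call is made.

```python
import json


def count_movies_by_actor(actor: str) -> str:
    # Hypothetical retriever tool with made-up sample data;
    # a real tool would query a database here.
    sample_counts = {"Keanu Reeves": 3}
    return f"{actor} appears in {sample_counts.get(actor, 0)} movies (sample data)."


# Tool description sent to the model alongside the user's question.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "count_movies_by_actor",
            "description": "Count the movies an actor appears in.",
            "parameters": {
                "type": "object",
                "properties": {
                    "actor": {"type": "string", "description": "Full name of the actor"}
                },
                "required": ["actor"],
            },
        },
    }
]

# Maps the tool names the model may return to the Python functions to run.
TOOL_REGISTRY = {"count_movies_by_actor": count_movies_by_actor}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Execute one tool call; arguments arrive as a JSON-encoded string."""
    if name not in TOOL_REGISTRY:
        return f"Unknown tool: {name}"
    try:
        kwargs = json.loads(arguments_json)
    except json.JSONDecodeError:
        return f"Invalid arguments for tool: {name}"
    return TOOL_REGISTRY[name](**kwargs)


# Simulated tool call, standing in for what the model would return.
simulated_call = {"name": "count_movies_by_actor", "arguments": '{"actor": "Keanu Reeves"}'}
answer = run_tool_call(simulated_call["name"], simulated_call["arguments"])
print(answer)
```

The model never runs the function itself. It only returns the tool name and arguments, and our code is responsible for executing the call and passing the result back.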

5.1 What is agentic RAG?

5.1.1 Retriever agents

5.1.2 The retriever router

5.1.3 Answer critic

5.2 Why do we need agentic RAG?

5.3 How to implement agentic RAG

5.3.1 Implementing retriever tools

5.3.2 Implementing the retriever router

5.3.3 Implementing the answer critic

5.3.4 Tying it all together

Summary