5 Agentic RAG

This chapter covers

What is agentic RAG?
Why do we need agentic RAG?
How to implement agentic RAG

In earlier chapters we saw how to find relevant data using different methods of vector similarity search. Similarity search lets us find relevant data in unstructured data sources, but structured data can often bring more value than unstructured data because the structure itself carries information.

Adding structure to data can be an incremental process. We can start with a simple structure and then add more complex structures as we go. We saw this in the previous chapter where we started with simple graph data and then added more complex structures to it.

An agentic RAG system is one in which a variety of retriever agents are available to retrieve the data needed to answer the user’s question.

The starting interface to an agentic RAG system is usually a retriever router, whose job is to find the best-suited retriever (or retrievers) to perform the task at hand.
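To make the idea of a retriever router concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Retriever` class, the two placeholder retrievers (`vector_search` and `graph_lookup`), and the keyword rule in `choose_retrievers`, which merely stands in for the decision that an LLM would make in a real system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Retriever:
    """A named retriever with a description the router can reason about."""
    name: str
    description: str
    run: Callable[[str], List[str]]


def vector_search(question: str) -> List[str]:
    # Hypothetical placeholder for a vector similarity search retriever.
    return [f"vector hit for: {question}"]


def graph_lookup(question: str) -> List[str]:
    # Hypothetical placeholder for a retriever over structured (graph) data.
    return [f"graph hit for: {question}"]


RETRIEVERS: Dict[str, Retriever] = {
    r.name: r
    for r in [
        Retriever("vector_search", "Free-text questions over unstructured documents", vector_search),
        Retriever("graph_lookup", "Questions about counts and relationships", graph_lookup),
    ]
}


def choose_retrievers(question: str) -> List[str]:
    # Assumption: a simple keyword rule replaces the LLM's choice so that
    # the sketch runs without any external service.
    q = question.lower()
    if any(marker in q for marker in ("how many", "who", "related")):
        return ["graph_lookup"]
    return ["vector_search"]


def route(question: str) -> List[str]:
    """Run every retriever the router selected and collect the results."""
    results: List[str] = []
    for name in choose_retrievers(question):
        results.extend(RETRIEVERS[name].run(question))
    return results
```

Keeping the choice of retriever in its own function means the keyword rule can later be swapped for an LLM call without touching the retrievers or the `route` function.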

One common way to implement an agentic RAG system is to use an LLM’s ability to call tools (sometimes called function calling). Not all LLMs have this ability, but OpenAI’s GPT-3.5 and GPT-4 do, and they are what we will use in this chapter. Most other LLMs can achieve something similar using the ReAct approach, and the current trajectory suggests that tool calling will eventually be available in all LLMs.
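As a rough illustration of tool use, the sketch below describes one retriever as a tool in the JSON shape used by OpenAI-style function calling, then dispatches a tool call to the matching Python function. It makes no network request: `simulated_call` is a hand-written stand-in for what a model might return, and `search_documents`, `TOOL_IMPLEMENTATIONS`, and `dispatch_tool_call` are hypothetical names chosen for this example only.

```python
import json
from typing import Callable, Dict, List


def search_documents(query: str) -> List[str]:
    # Hypothetical retriever; a real one would run a vector similarity search.
    return [f"document snippet about {query}"]


# Tool description: a name, a natural-language description the model reads,
# and a JSON Schema for the arguments the model is expected to supply.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_documents",
            "description": "Find text passages relevant to a free-text query.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "What to search for."}
                },
                "required": ["query"],
            },
        },
    }
]

# Maps each tool name to the function that implements it.
TOOL_IMPLEMENTATIONS: Dict[str, Callable[..., List[str]]] = {
    "search_documents": search_documents,
}


def dispatch_tool_call(name: str, arguments_json: str) -> List[str]:
    """Run the tool the model asked for; arguments arrive as a JSON string."""
    if name not in TOOL_IMPLEMENTATIONS:
        raise ValueError(f"Unknown tool: {name}")
    arguments = json.loads(arguments_json)
    return TOOL_IMPLEMENTATIONS[name](**arguments)


# Assumption: a simulated tool call standing in for a real model response.
simulated_call = {"name": "search_documents", "arguments": '{"query": "agentic RAG"}'}
result = dispatch_tool_call(simulated_call["name"], simulated_call["arguments"])
```

In a real application the `TOOLS` list would be sent to the model along with the user question, and the tool name and arguments would come from the model's response rather than from a hard-coded dictionary.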

Figure 5.1 The data flow for an application using agentic RAG.

5.1 What is agentic RAG?

5.1.1 Query Rewriter

5.1.2 Retriever Agents

5.1.3 The Retriever Router

5.1.4 Answer Critic

5.2 Why do we need agentic RAG?

5.3 How to implement agentic RAG

5.3.1 Implementing the Query Rewriter

5.3.2 Implementing Retriever Agents

5.3.3 Implementing the Retriever Router

5.3.4 Implementing the Answer Critic

5.3.5 Tying it all together

5.4 Summary

5.5 References