
7 Retrieval-augmented generation – the secret weapon

This chapter covers

  • Introducing the concepts of Retrieval-Augmented Generation (RAG)
  • Benefits of the RAG architecture in conjunction with LLMs
  • Understanding the role of vector databases and indexes in implementing RAG
  • Basics of vector search and its distance functions
  • Challenges in RAG implementation and potential solutions
  • Delving into different methods of chunking text for RAG

Large language models, as we have seen, are remarkably powerful and let us achieve things that, until very recently, were not possible. Interestingly, these models capture a broad slice of the world's knowledge and are available to anyone, anywhere in the world, through an API.

However, LLMs have a knowledge constraint: their knowledge extends only up to their training cutoff date, and they have no information about anything that happened afterward. As a result, LLMs cannot draw on the latest information. In addition, their training corpus contains no private, non-public knowledge, so they cannot answer questions that depend on an enterprise's proprietary data.
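Before we define RAG formally in section 7.1, a minimal sketch can make the retrieve-then-generate idea concrete. Everything here is illustrative: the toy documents, the bag-of-words embed function (a stand-in for a real embedding model), and the fact that we return the assembled prompt instead of calling an actual LLM.

```python
# A minimal, self-contained sketch of the RAG idea: retrieve the most
# relevant private document for a question, then prepend it to the prompt
# so the LLM can answer from knowledge it was never trained on.
from collections import Counter
import math

# Hypothetical "private" documents an LLM would never have seen in training.
documents = [
    "Acme's 2024 refund policy allows returns within 45 days.",
    "Acme headquarters relocated to Austin in March 2024.",
    "Acme's internal VPN requires rotating tokens every 12 hours.",
]

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words term-frequency vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str) -> str:
    # Return the document most similar to the question.
    q = embed(question)
    return max(documents, key=lambda d: cosine(q, embed(d)))

def answer(question: str) -> str:
    # Augment the prompt with the retrieved context.
    context = retrieve(question)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return prompt  # A real system would send this prompt to an LLM here.

print(answer("How many days do customers have to return a product?"))
```

A production system would swap the bag-of-words vectors for learned embeddings stored in a vector database, which is exactly where this chapter is headed.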

7.1 What is Retrieval-Augmented Generation (RAG)?

7.2 Benefits of RAG

7.3 RAG Architecture

7.3.1 Retriever System

7.4 Understanding Vector Databases

7.4.1 What is a Vector Index?

7.4.2 Vector Search

7.5 Challenges with RAG

7.6 Overcoming Chunking Challenges

7.6.1 Chunking Strategies

7.6.2 Handling Unknown Complexities

7.6.3 Chunking Sentences

7.6.4 Chunking Using Natural Language Processing

7.7 Chunking PDFs

7.8 Summary

7.9 References