6 Retrieval Augmented Generation (RAG)
This chapter covers:
- A deep dive into Retrieval Augmented Generation (RAG)
- Combining encoders and decoders to generate personalized search results
- The need for chunking larger documents
- Using a vector database instead of FAISS
- Evaluating the performance of search results and RAG
We’ve explored two powerful paradigms in modern AI: semantic search using encoder models and text generation using decoder models (LLMs). While each of these technologies is impressive, their true potential emerges when we combine them. This combination, known as Retrieval Augmented Generation (RAG), represents a significant advancement in building intelligent systems that are both knowledgeable and articulate.
As we discussed in Chapter 5, the fundamental challenge with large language models is their tendency to hallucinate: to generate plausible-sounding but potentially incorrect information. While they excel at understanding and generating human-like text, they lack access to current, verified information about specific domains. This limitation becomes particularly apparent in applications like our Travelle hotel search system, where accurate, up-to-date information is crucial for providing valuable recommendations to users. It is also why general-purpose assistants like ChatGPT now offer integrated web search and research tools: retrieval grounds the model's answers in information it was never trained on.
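Before we dive into the details, here is the shape of the pipeline in code: a minimal sketch of retrieve-then-generate, assuming the sentence-transformers and FAISS libraries from earlier chapters. The hotel snippets, the model name, and the generate() helper are illustrative placeholders rather than Travelle's actual code, and later in this chapter we replace the FAISS index with a vector database:

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# Encoder: maps documents and queries into the same vector space.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# A toy corpus standing in for real hotel descriptions.
documents = [
    "The Grand Plaza offers free breakfast and a rooftop pool.",
    "Seaside Inn is a budget-friendly hotel two blocks from the beach.",
    "Mountain Lodge features ski-in/ski-out access and a full spa.",
]

# Index the document embeddings; with normalized vectors,
# inner product is equivalent to cosine similarity.
doc_vectors = encoder.encode(documents, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_vectors.shape[1])
index.add(np.asarray(doc_vectors, dtype="float32"))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vector = encoder.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(query_vector, dtype="float32"), k)
    return [documents[i] for i in ids[0]]

def generate(prompt: str) -> str:
    """Placeholder for a decoder/LLM call; plug in your client here."""
    raise NotImplementedError("connect this to the LLM of your choice")

def rag_answer(query: str) -> str:
    """Retrieve context, augment the prompt, and let the LLM generate."""
    context = "\n".join(retrieve(query))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    return generate(prompt)
```

The retriever supplies facts the model was never trained on, and the prompt constrains the decoder to those facts. The rest of this chapter refines each stage of this pipeline: chunking the documents we index, scaling retrieval with a vector database, and evaluating both the search results and the generated answers.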