7 Evolving RAGOps Stack: Technologies that make RAG possible
This chapter covers
- The design of RAG systems
- Available tools and technologies that enable a RAG system
- Production best practices for RAG systems
So far in the book, we have discussed the indexing pipeline, the generation pipeline, and the evaluation of a RAG system. In Chapter 6, we also covered some advanced strategies and techniques that are useful when building production-grade RAG systems. These help improve the accuracy of retrieval and generation and, in some cases, reduce the latency of the system. With all this information, you should be able to stitch together a RAG system for your use cases. In Chapter 2, we briefly laid out the design of such a RAG system. In this chapter, we will elaborate on that design.
A RAG system is composed of standard application layers as well as layers specific to generative AI applications. All these layers, stacked together, create a robust RAG system.
These layers are supported by a technology infrastructure. We will delve into these layers and the technologies and tools from popular service providers that can be used to build a RAG system. Some providers have started offering managed end-to-end RAG solutions, which we will touch upon in this chapter.
We will wrap up the chapter with some lessons and best practices for putting RAG systems into production. This chapter marks the end of part 3 of the book.
By the end of this chapter, you should: