chapter seven

7 Evolving RAGOps Stack: Technologies that make RAG possible

This chapter covers

The design of RAG systems
Available tools and technologies that enable a RAG system
Production best practices for RAG systems

So far in this book, we have discussed the indexing pipeline, the generation pipeline, and the evaluation of a RAG system. In Chapter 6, we also covered advanced strategies and techniques that are useful when building production-grade RAG systems. These improve the accuracy of retrieval and generation and, in some cases, reduce the latency of the system. With this information, you should be able to stitch together a RAG system for your own use cases. In Chapter 2, we briefly laid out the design of such a RAG system. In this chapter, we elaborate on that design.

A RAG system is composed of standard application layers as well as layers specific to generative AI applications. All these layers, stacked together, create a robust RAG system.

These layers are supported by a technology infrastructure. We will delve into each layer and into the technologies and tools, offered by popular service providers, that can be used to build a RAG system. Some providers have started offering managed end-to-end RAG solutions, which we will also touch upon in this chapter.

We will wrap up the chapter with some lessons and best practices for putting RAG systems into production. This chapter marks the end of Part 3 of the book.

This chapter covers

7.1 The Evolving RAGOps Stack

7.1.1 Critical Layers

7.1.2 Essential Layers

7.1.3 Enhancement Layers

7.2 Production Best Practices

7.3 Summary