14 Building retrieval-augmented generation AI chatbots
This chapter covers
- Experiencing large language model hallucinations
- Gaining insight into retrieval-augmented generation and MongoDB
- Positioning Atlas Vector Search within RAG
- Orchestrating the RAG pattern with LangChain
- Building a generative AI chatbot
- Playing with the LangServe playground
Large language model (LLM) hallucinations occur when a model generates information that isn't grounded in facts or in its given inputs. These errors can take the form of made-up details, incorrect facts, or plausible-sounding but wrong responses. They happen because LLMs such as GPT-4 generate text from patterns learned during training rather than by checking facts, so they may produce content that looks right but isn't accurate. Reducing these mistakes is important for the reliability of LLMs. Mitigation methods include improving training-data quality, applying real-time fact-checking, adding stronger verification systems, and using retrieval-augmented generation (RAG), which combines text generation with real-time information retrieval to improve accuracy. MongoDB Atlas Vector Search can serve as a key component for storing and retrieving the data that RAG systems rely on, ensuring that LLMs have access to accurate, up-to-date information during the generation process.
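The retrieval step at the heart of RAG can be sketched in a few lines. The toy example below stands in for the real pipeline: the hand-made three-element vectors are placeholders for embeddings produced by an embedding model, and the in-memory list stands in for a vector store such as MongoDB Atlas Vector Search. The function and variable names are illustrative, not part of any library API.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two vectors, as a vector search engine would compute."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stand-in "knowledge base": document text paired with a pretend embedding.
# A real system would store these in a vector database instead.
documents = [
    ("MongoDB Atlas Vector Search stores and queries embedding vectors.", [0.9, 0.1, 0.0]),
    ("LangChain orchestrates prompts, models, and retrievers.",           [0.1, 0.9, 0.0]),
    ("LLM hallucinations are fluent but factually wrong outputs.",        [0.0, 0.1, 0.9]),
]

def retrieve(query_vector, k=1):
    """Return the k documents most similar to the query vector."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, doc[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

def build_augmented_prompt(question, query_vector):
    """Ground the question in retrieved context before sending it to an LLM."""
    context = "\n".join(retrieve(query_vector))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# The query vector here is hand-made; in practice it comes from embedding the question.
prompt = build_augmented_prompt("What does Atlas Vector Search do?", [0.8, 0.2, 0.1])
print(prompt)
```

Because the final prompt contains retrieved facts, the model is asked to answer from supplied context rather than from memorized training patterns, which is what reduces hallucination. The chapter builds the production version of this loop with LangChain and Atlas Vector Search.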