2 Working with Large Language Models
This chapter covers
- How LangChain builds on LLM capabilities
- What an LLM is and how it works
- The most common use cases
- How to adapt an LLM to users' needs
- Main LLMs and how to choose one
To use LangChain proficiently, it's essential to have a basic understanding of how LLMs work. I'll explain the technology behind LLMs, including artificial neural networks, the transformer architecture, and the "attention" mechanism. As we progress, we'll explore how LangChain helps you get more out of LLMs with techniques like prompt engineering, Retrieval-Augmented Generation (RAG), and fine-tuning, which means creating refined versions of existing LLMs. We'll also see how LangChain lets you work with both proprietary and open-source models. Let's begin with the fundamental question: how does an LLM work?
2.1 What is a Large Language Model?
A Large Language Model (LLM) is an AI model designed for tasks involving human language. These tasks include understanding, generating, summarizing, and translating text, all of which were once considered distinctly human abilities. The key to making LLMs produce human-like text is training them on vast amounts of text: articles, books, social media, and web content. The largest LLMs are trained on trillions of "tokens," which are pieces of text that roughly correspond to words (some words may be split into multiple tokens).
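To make the idea of tokens concrete, here is a minimal sketch of how a word can be split into several tokens. It is a toy illustration only: the vocabulary `TOY_VOCAB` and the function `toy_tokenize` are hypothetical names invented for this example, and the greedy longest-match rule is a simplification, not the algorithm used by any particular LLM. Real tokenizers learn vocabularies of tens of thousands of entries from data.

```python
# Hypothetical toy vocabulary, chosen by hand for illustration only.
TOY_VOCAB = {"the", "token", "s", "un", "believ", "able"}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text into tokens using greedy longest-match against vocab.

    Simplified illustration: pieces not found in the vocabulary fall back
    to single characters.
    """
    tokens = []
    for word in text.lower().split():
        start = 0
        while start < len(word):
            # Try the longest remaining piece first, then shorter ones.
            for end in range(len(word), start, -1):
                piece = word[start:end]
                if piece in vocab:
                    break
            else:
                # No vocabulary entry matched: emit one character.
                piece = word[start]
                end = start + 1
            tokens.append(piece)
            start = end
    return tokens


print(toy_tokenize("the tokens"))    # ['the', 'token', 's']
print(toy_tokenize("unbelievable"))  # ['un', 'believ', 'able']
```

Notice that "the" maps to a single token, while "tokens" and "unbelievable" are split into two and three tokens respectively. This is why token counts are usually somewhat higher than word counts.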