2 The brain of AI agents: LLMs
This chapter covers
- LLMs' core capabilities
- Selecting the right LLM
- Using LLM APIs
- Prompt engineering techniques
- Hands-on problem solving using a GAIA benchmark problem
Throughout this book, we’ll build a Research Agent and use it as a concrete thread to ground the concepts. The Research Agent needs to interpret requests like “survey recent work on X” or “extract key findings from these PDFs” and make decisions such as which sources to trust or when to ask clarifying questions.
Let’s turn our attention to the LLM—the “brain” of an LLM agent—shown as the core component in figure 2.1. The LLM acts as the reasoning engine that powers the entire agent system: it interprets user requests, orchestrates interactions with tools (component 2), and drives the agent’s decision-making process. Together with tools, the LLM forms the foundation of a basic agent (component 3), which we’ll construct throughout these initial chapters.
Figure 2.1 The LLM serves as the reasoning engine for AI agents.
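To make the pattern in figure 2.1 concrete, here is a minimal sketch of the interpret–act–answer loop. The function and prompt conventions (`fake_llm`, the `TOOL:`/`ANSWER:` protocol) are hypothetical stand-ins, not a real library API; a production agent would replace `fake_llm` with a call to an actual LLM provider.

```python
def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call (the agent's 'brain')."""
    if prompt.startswith("Observation:"):
        # The model has tool results, so it can answer directly.
        return "ANSWER: summary based on " + prompt
    # Otherwise the model decides a tool is needed.
    return "TOOL: search"

def search_tool(query: str) -> str:
    """Placeholder tool (component 2 in figure 2.1)."""
    return f"results for {query!r}"

def run_agent(request: str) -> str:
    """The LLM drives the loop: interpret the request, act, then answer."""
    decision = fake_llm(request)
    if decision.startswith("TOOL:"):
        observation = search_tool(request)
        decision = fake_llm("Observation: " + observation)
    return decision.removeprefix("ANSWER: ")
```

Even in this toy form, the division of labor matches the diagram: the LLM makes every decision, while tools only execute the actions it requests.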
