chapter six

6 Adding memory to your agent

This chapter covers

The role of memory in LLM agents
Managing context growth with sliding window, compaction, and summarization
Implementing sessions for multi-turn conversations
Building asynchronous human-in-the-loop workflows
Creating long-term memory for cross-session knowledge retention

Memory is what separates a stateless tool from an intelligent assistant. Without memory, an agent cannot recall previous events within the same task, continue conversations from earlier sessions, or learn from past experiences. Each interaction starts from scratch, forcing users to repeat context and preventing the agent from improving over time.

This chapter addresses memory in three usage patterns. First, we implement context optimization strategies to prevent context explosion during complex problem-solving. Second, we build Session and SessionManager to maintain conversation continuity across multiple interactions, extending this architecture to support asynchronous human-in-the-loop workflows. Finally, we create a long-term memory system that extracts, stores, and retrieves knowledge across session boundaries using vector search.

Figure 6.1 Book structure overview: Chapter 6 in focus.

6.1 The anatomy of agent memory

6.1.1 Limitations of the current memory architecture

6.1.2 Context engineering and memory

6.2 Managing context during execution

6.2.1 Separating storage from presentation

6.2.2 Sliding window strategy

6.2.3 Token counting

6.2.4 Compaction strategy

6.2.5 Summarization strategy

6.2.6 Hierarchical context management

6.3 Continuous execution: Session and state management

6.3.1 The session class

6.3.2 Managing sessions with SessionManager

6.3.3 Integrating sessions into the agent

6.3.4 Basic example: Multi-turn conversation

6.3.5 Data structures for tool confirmation

6.3.6 Extending tools for confirmation

6.3.7 Implementing pause and resume in the agent

6.3.8 Complete example: Human-in-the-loop workflow

6.4 Long-term memory: Accumulating knowledge across sessions

6.4.1 The structure of long-term memory

6.4.2 Information extraction: Structured output

6.4.3 Building a vector store with ChromaDB

6.4.4 Implementing TaskMemoryManager

6.4.5 Retrieving memories

6.5 Summary