chapter eleven

11 Managing Session State and Chat History

 

This chapter covers

  • How st.session_state works and why Streamlit needs it
  • Why LLMs do not actually remember conversations
  • What the context window is and what happens when you exceed it
  • Implementing conversation history trimming
  • Adding a robust reset button with confirmation

After using your chatbot for a while, you may notice something unexpected. During a long conversation -- say, twenty or thirty exchanges -- the AI's responses start to degrade. It repeats itself, contradicts earlier statements, and takes longer to respond. Your chatbot is not broken. It has hit the context window limit.

This chapter explains why that happens and how to fix it. You will learn how Streamlit's session state works in depth, why LLMs are stateless, what the context window is, and how to manage conversation history to keep your chatbot performing well.

11.1 The Problem: LLMs Are Stateless

Before looking at Streamlit, start with the AI model itself. Stateless means the server does not automatically remember what happened in earlier interactions. An LLM does not remember previous turns on its own. Each call to ollama.chat() is independent. If you want the model to respond as if it remembers the conversation, your app must send the relevant conversation history again with each new request.

11.2 Understanding Streamlit Session State

11.2.1 The Re-Run Problem

11.2.2 Session State in Action

11.2.3 What You Can Store in Session State

11.2.4 When Session State Resets

11.3 How LLMs Use Conversation History

11.4 Context Window Management

11.4.1 Context Window Sizes

11.4.2 What Happens When You Exceed the Context Window

11.4.3 Implementing History Trimming

11.4.4 The Trade-Off

11.5 Adding a Reset Button

11.5.1 Other Useful Session State Patterns

11.6 The Complete Data Flow

11.7 Summary

11.8 Exercises