11 Managing Session State and Chat History
This chapter covers
- How st.session_state works and why Streamlit needs it
- Why LLMs do not actually remember conversations
- What the context window is and what happens when you exceed it
- Implementing conversation history trimming
- Adding a robust reset button with confirmation
After using your chatbot for a while, you may notice something unexpected. During a long conversation -- say, twenty or thirty exchanges -- the AI's responses start to degrade. It repeats itself, contradicts earlier statements, and takes longer to respond. Your chatbot is not broken. It has hit the context window limit.
This chapter explains why that happens and how to fix it. You will learn how Streamlit's session state works in depth, why LLMs are stateless, what the context window is, and how to manage conversation history to keep your chatbot performing well.
11.1 The Problem: LLMs Are Stateless
Before looking at Streamlit, start with the AI model itself. Stateless means the server does not automatically remember what happened in earlier interactions. An LLM does not remember previous turns on its own. Each call to ollama.chat() is independent. If you want the model to respond as if it remembers the conversation, your app must send the relevant conversation history again with each new request.