Short-Term vs Long-Term Memory for AI Agents
Free · Open source (MIT) · Works with LangChain, CrewAI, AutoGen · No signup
AI agents need memory that persists beyond single conversations. Without proper memory architecture, your agent forgets everything between restarts, can't build context across sessions, and burns through token limits on repetitive tasks. This breaks user experience and makes agents feel stateless and dumb.
The Memory Architecture Problem
Most developers building AI agents hit the same wall: where do I store different types of memory? Your agent needs multiple memory layers:
- Short-term agent memory: Current conversation context, recent interactions, temporary state
- Long-term agent memory: User preferences, learned behaviors, persistent facts across sessions
- Working memory: Function call results, API responses, computed values
The common mistake is shoving everything into the LLM's context window. This burns tokens, hits length limits, and doesn't persist across restarts. You need an agent memory architecture that separates concerns:
- Context window: immediate conversation only
- Key-value store: persistent structured data
- Vector DB: semantic search over large knowledge bases
Without this separation, your agent either forgets everything or drowns in irrelevant context. Users complain it "doesn't remember" previous conversations, and your token costs explode.
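As a rough illustration of that separation, the sketch below keeps a bounded buffer of recent turns for the context window and a separate dict for durable facts. `LayeredMemory` and its method names are hypothetical, and the dict only stands in for a persistent key-value store; nothing here is BotWire's API.

```python
from collections import deque


class LayeredMemory:
    """Toy illustration: a bounded short-term buffer plus a long-term key-value dict."""

    def __init__(self, max_turns: int = 4):
        # Short-term layer: only the most recent turns are kept for the prompt.
        self.short_term = deque(maxlen=max_turns)
        # Long-term layer: a plain dict standing in for a persistent key-value store.
        self.long_term = {}

    def add_turn(self, role: str, text: str) -> None:
        """Record a turn; the oldest turn is dropped once the buffer is full."""
        self.short_term.append({"role": role, "content": text})

    def remember(self, key: str, value) -> None:
        """Store a durable fact outside the context window."""
        self.long_term[key] = value

    def build_context(self) -> dict:
        """Combine durable facts with the recent turns for one LLM call."""
        return {
            "facts": dict(self.long_term),
            "recent_turns": list(self.short_term),
        }


layers = LayeredMemory(max_turns=2)
layers.remember("user_name", "Sarah")
for i in range(5):
    layers.add_turn("user", f"message {i}")

context = layers.build_context()
print(context["facts"])
print([turn["content"] for turn in context["recent_turns"]])
```

Only the last two turns reach the prompt, while the stored fact stays available no matter how long the conversation runs.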
The Fix: Persistent Key-Value Memory
Install BotWire to add persistent memory to any agent:
pip install botwire
Here's the basic pattern for storing and retrieving long-term facts:
from botwire import Memory
# Initialize memory with a namespace
memory = Memory("user-42")
# Store long-term facts
memory.set("user_name", "Sarah")
memory.set("preferences", {"language": "python", "framework": "fastapi"})
memory.set("conversation_count", 15)
# Retrieve in your agent logic
user_name = memory.get("user_name")
prefs = memory.get("preferences")
print(f"Welcome back, {user_name}! You've had {memory.get('conversation_count')} conversations.")
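To exercise agent logic in unit tests without calling the hosted service, one option is a small in-memory stand-in. The sketch below is an assumption, not part of BotWire: `InMemoryStore` and `greeting` are hypothetical names, and the stand-in only mirrors the `set`, `get`, `delete`, and `list_keys` calls used in this article. Returning `None` for a missing key is a choice made for this sketch; check how `Memory.get` behaves for keys that were never set before relying on it.

```python
class InMemoryStore:
    """Hypothetical test double that mimics a key-value memory; not part of BotWire."""

    def __init__(self, namespace: str):
        self.namespace = namespace
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        # Assumption for this sketch: missing keys yield None.
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)

    def list_keys(self):
        return sorted(self._data)


def greeting(store) -> str:
    """Build a greeting, falling back gracefully on a user's first visit."""
    name = store.get("user_name")
    if name is None:
        return "Hello! What should I call you?"
    count = store.get("conversation_count")
    return f"Welcome back, {name}! You've had {count} conversations."


store = InMemoryStore("user-42")
print(greeting(store))  # first visit: nothing stored yet

store.set("user_name", "Sarah")
store.set("conversation_count", 15)
print(greeting(store))
```

Because `greeting` only depends on a `get` method, the same function can be handed a real memory object in production and the stand-in in tests.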
Memory Architecture Walkthrough
The key insight: different data belongs in different layers. Here's how to structure your agent's memory:
from datetime import datetime

from botwire import Memory


class AgentMemory:
    def __init__(self, user_id: str):
        self.memory = Memory(f"agent-{user_id}")
        self.user_id = user_id

    def remember_user_fact(self, key: str, value):
        """Long-term memory: facts that persist across sessions"""
        self.memory.set(f"user:{key}", value)

    def remember_conversation_summary(self, summary: str):
        """Session memory: compress old conversations"""
        timestamp = datetime.now().isoformat()
        self.memory.set(f"summary:{timestamp}", summary)

    def get_user_context(self):
        """Retrieve context for prompt engineering"""
        name = self.memory.get("user:name")
        role = self.memory.get("user:role")
        recent_summaries = self._get_recent_summaries(limit=3)
        return {
            "name": name,
            "role": role,
            "conversation_history": recent_summaries,
        }

    def _get_recent_summaries(self, limit: int):
        """Get recent conversation summaries"""
        all_keys = self.memory.list_keys()
        summary_keys = [k for k in all_keys if k.startswith("summary:")]
        # Sort by timestamp (keys are ISO format, so string order is time order)
        recent_keys = sorted(summary_keys, reverse=True)[:limit]
        return [self.memory.get(key) for key in recent_keys]
# Usage in your agent
agent_memory = AgentMemory("user-42")
agent_memory.remember_user_fact("name", "Sarah")
agent_memory.remember_user_fact("role", "backend engineer")
# Later, in your agent's prompt
context = agent_memory.get_user_context()
This pattern separates short-term agent memory (the current conversation) from long-term agent memory (persistent user facts and conversation summaries). The stored data survives restarts, works across processes, and doesn't pollute your context window.
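One way to use the context dict is to render it into a short system prompt. The sketch below assumes the dict has the `name`, `role`, and `conversation_history` keys shown above; `render_system_prompt` and the prompt wording are hypothetical, so adapt them to your agent.

```python
def render_system_prompt(context: dict) -> str:
    """Turn a user-context dict into system prompt text, skipping missing fields."""
    lines = ["You are a helpful assistant."]
    if context.get("name"):
        lines.append(f"The user's name is {context['name']}.")
    if context.get("role"):
        lines.append(f"The user works as a {context['role']}.")
    # Drop empty or missing summaries so the prompt stays compact.
    summaries = [s for s in context.get("conversation_history", []) if s]
    if summaries:
        lines.append("Summaries of recent conversations, newest first:")
        lines.extend(f"- {summary}" for summary in summaries)
    return "\n".join(lines)


example_context = {
    "name": "Sarah",
    "role": "backend engineer",
    "conversation_history": ["Discussed FastAPI routing.", None],
}
print(render_system_prompt(example_context))
```

Skipping empty fields keeps the prompt valid for first-time users, whose context may contain no name, role, or summaries yet.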
For cleanup and TTL-like behavior, you can implement your own retention policy:
from datetime import datetime, timedelta

# Add this as a method on the AgentMemory class
def cleanup_old_summaries(self, days: int = 30):
    """Remove conversation summaries older than N days"""
    cutoff = datetime.now() - timedelta(days=days)
    for key in self.memory.list_keys():
        if key.startswith("summary:"):
            # The part after the prefix is the ISO timestamp used when saving
            timestamp = datetime.fromisoformat(key[len("summary:"):])
            if timestamp < cutoff:
                self.memory.delete(key)
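The retention check can also be pulled out into a pure function, which makes it easy to test without touching stored data. This is a sketch under a few assumptions: `expired_summary_keys` is a hypothetical helper, keys follow the `summary:<ISO timestamp>` layout used above, and timestamps are naive local times as produced by `datetime.now().isoformat()`.

```python
from datetime import datetime, timedelta

SUMMARY_PREFIX = "summary:"


def expired_summary_keys(keys, now: datetime, days: int = 30) -> list:
    """Return the summary keys whose embedded timestamp is older than `days`."""
    cutoff = now - timedelta(days=days)
    expired = []
    for key in keys:
        if not key.startswith(SUMMARY_PREFIX):
            continue  # not a summary key
        try:
            stamp = datetime.fromisoformat(key[len(SUMMARY_PREFIX):])
        except ValueError:
            continue  # suffix is not an ISO timestamp; leave the key alone
        if stamp < cutoff:
            expired.append(key)
    return expired


keys = [
    "summary:2024-01-01T10:00:00",
    "summary:2024-03-01T10:00:00",
    "summary:not-a-date",
    "user:name",
]
print(expired_summary_keys(keys, now=datetime(2024, 3, 10), days=30))
```

Passing `now` in explicitly keeps the function deterministic; the caller can then delete each returned key from the store.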
Framework Integration
For LangChain agents, use the chat history adapter:
from botwire import BotWireChatHistory
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain.memory import ConversationBufferMemory

# Persistent chat history
chat_history = BotWireChatHistory(session_id="user-42")
memory = ConversationBufferMemory(
    chat_memory=chat_history,
    return_messages=True,
)

# Use in your LangChain agent
# (your_agent and your_tools are placeholders for your own agent and tools)
agent_executor = AgentExecutor(
    agent=your_agent,
    tools=your_tools,
    memory=memory,  # Now persists across sessions
    verbose=True,
)
For CrewAI, add memory tools to your crew:
from botwire.memory import memory_tools
from crewai import Agent

# Get remember/recall/list_memory tools
tools = memory_tools("crew-project-alpha")

# Add to your CrewAI agent (other_tools is a placeholder for your existing tools)
agent = Agent(
    role="Research Assistant",
    goal="Remember important facts across tasks",
    tools=tools + other_tools,
    backstory="You can remember things between conversations...",
)
When NOT to Use BotWire
BotWire isn't the right choice for:
- Vector/semantic search: Use Pinecone, Weaviate, or Chroma for embedding-based retrieval
- High-throughput applications: The free tier allows 1000 writes/day per namespace, and the service is not built for massive scale
- Sub-millisecond latency: HTTP-based, not optimized for ultra-low latency use cases
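If the daily write limit is the main concern, one mitigation is to skip writes that would not change the stored value. The sketch below is an assumption about how that could be done on the client side: `ChangeOnlyWriter` and `DictStore` are hypothetical names, the wrapper works with any object exposing `set(key, value)`, and its cache lives only in the current process, so the first write after a restart always goes through.

```python
import copy


class DictStore:
    """Hypothetical stand-in for a remote key-value store with a set() method."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value


class ChangeOnlyWriter:
    """Forwards a write only when the value differs from the last one written."""

    def __init__(self, store):
        self.store = store
        self._last_written = {}
        self.writes = 0  # number of writes actually forwarded

    def set(self, key, value) -> bool:
        """Return True if the write was forwarded, False if it was skipped."""
        if key in self._last_written and self._last_written[key] == value:
            return False
        self.store.set(key, value)
        # Keep a copy so later mutation of `value` cannot hide a real change.
        self._last_written[key] = copy.deepcopy(value)
        self.writes += 1
        return True


backing = DictStore()
writer = ChangeOnlyWriter(backing)
results = [
    writer.set("preferences", {"language": "python"}),
    writer.set("preferences", {"language": "python"}),  # unchanged, skipped
    writer.set("preferences", {"language": "rust"}),
]
print(results, writer.writes)
```

This trades a little local memory for fewer remote writes; it does not help when every write carries a new value.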
FAQ
Why not just use Redis or a database? BotWire is zero-config with no infrastructure setup. Redis requires hosting, connection management, and persistence configuration. For agent memory, a simple key-value store with an HTTP API is often enough.
Is this actually free? Yes, 1000 writes/day per namespace forever. Unlimited reads. No signup required. You can also self-host the MIT-licensed version.
What about data privacy? Data is stored on BotWire's servers by default. For sensitive use cases, self-host the open source version (single FastAPI + SQLite service) or use a private namespace.
Get Started
Add persistent memory to your AI agent in under 2 minutes. Run pip install botwire and start building agents that actually remember.
Try it at https://botwire.dev — no signup required.