Short-Term vs Long-Term Memory for AI Agents

Free · Open source (MIT) · Works with LangChain, CrewAI, AutoGen · No signup

AI agents need memory that persists beyond single conversations. Without proper memory architecture, your agent forgets everything between restarts, can't build context across sessions, and burns through token limits on repetitive tasks. This breaks user experience and makes agents feel stateless and dumb.

The Memory Architecture Problem

Most developers building AI agents hit the same wall: where do I store different types of memory? Your agent needs multiple memory layers:

- Short-term (working) memory: the current conversation, held in the context window
- Session memory: compressed summaries of recent conversations
- Long-term memory: durable user facts and preferences that survive across sessions

The common mistake is shoving everything into the LLM's context window. This burns tokens, hits length limits, and doesn't persist across restarts. You need an agent memory architecture that separates these concerns.

Without this separation, your agent either forgets everything or drowns in irrelevant context. Users complain it "doesn't remember" previous conversations, and your token costs explode.
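Before reaching for any library, it helps to see the separation itself. Here's a minimal in-memory sketch of the layers (short-term conversation turns, session summaries, long-term facts) in plain Python — an illustration of the architecture, not BotWire's API:

```python
from dataclasses import dataclass, field

@dataclass
class LayeredMemory:
    """Toy sketch of three memory layers. A real agent would back
    the long-term layer with a persistent store."""
    working: list = field(default_factory=list)            # current conversation turns
    session_summaries: list = field(default_factory=list)  # compressed history
    long_term: dict = field(default_factory=dict)          # durable user facts

    def add_turn(self, role: str, text: str):
        self.working.append({"role": role, "text": text})

    def end_session(self, summary: str):
        # Compress the conversation into a summary, then clear working memory
        self.session_summaries.append(summary)
        self.working.clear()

mem = LayeredMemory()
mem.long_term["name"] = "Sarah"
mem.add_turn("user", "Help me debug my FastAPI app")
mem.end_session("Discussed FastAPI debugging")
print(mem.working)            # [] -- cleared when the session ends
print(mem.session_summaries)  # ['Discussed FastAPI debugging']
```

Each layer has a different lifetime: working memory dies with the session, summaries accumulate, and long-term facts never expire on their own. The rest of this article is about making that long-term layer actually persist.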

The Fix: Persistent Key-Value Memory

Install BotWire to add persistent memory to any agent:

pip install botwire

Here's the basic pattern for layered memory in an AI agent:

from botwire import Memory

# Initialize memory with a namespace
memory = Memory("user-42")

# Store long-term facts
memory.set("user_name", "Sarah")
memory.set("preferences", {"language": "python", "framework": "fastapi"})
memory.set("conversation_count", 15)

# Retrieve in your agent logic
user_name = memory.get("user_name")
prefs = memory.get("preferences")

print(f"Welcome back, {user_name}! You've had {memory.get('conversation_count')} conversations.")

Memory Architecture Walkthrough

The key insight: different data belongs in different layers. Here's how to structure your agent's memory:

from botwire import Memory
import json
from datetime import datetime

class AgentMemory:
    def __init__(self, user_id: str):
        self.memory = Memory(f"agent-{user_id}")
        self.user_id = user_id
    
    def remember_user_fact(self, key: str, value):
        """Long-term memory: facts that persist across sessions"""
        self.memory.set(f"user:{key}", value)
    
    def remember_conversation_summary(self, summary: str):
        """Session memory: compress old conversations"""
        timestamp = datetime.now().isoformat()
        self.memory.set(f"summary:{timestamp}", summary)
    
    def get_user_context(self):
        """Retrieve context for prompt engineering"""
        name = self.memory.get("user:name")
        role = self.memory.get("user:role") 
        recent_summaries = self._get_recent_summaries(limit=3)
        
        return {
            "name": name,
            "role": role,
            "conversation_history": recent_summaries
        }
    
    def _get_recent_summaries(self, limit: int):
        """Get recent conversation summaries"""
        all_keys = self.memory.list_keys()
        summary_keys = [k for k in all_keys if k.startswith("summary:")]
        # Sort by timestamp (keys are ISO format)
        recent_keys = sorted(summary_keys, reverse=True)[:limit]
        return [self.memory.get(key) for key in recent_keys]

# Usage in your agent
agent_memory = AgentMemory("user-42")
agent_memory.remember_user_fact("name", "Sarah")
agent_memory.remember_user_fact("role", "backend engineer")

# Later, in your agent's prompt
context = agent_memory.get_user_context()

This pattern separates short-term memory (the current conversation) from long-term memory (persistent user facts). The memory survives restarts, works across processes, and doesn't pollute your context window.
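The payoff of get_user_context() is prompt construction: you inject only the relevant facts, not the whole history. Here's one way to turn that context dict into a system-prompt preamble — build_system_prompt is a hypothetical helper, not part of BotWire:

```python
def build_system_prompt(context: dict) -> str:
    """Turn the context dict from get_user_context() into a compact
    system-prompt preamble, skipping any missing fields."""
    lines = []
    if context.get("name"):
        lines.append(f"The user's name is {context['name']}.")
    if context.get("role"):
        lines.append(f"They work as a {context['role']}.")
    for summary in context.get("conversation_history", []):
        lines.append(f"Previous conversation: {summary}")
    return "\n".join(lines)

prompt = build_system_prompt({
    "name": "Sarah",
    "role": "backend engineer",
    "conversation_history": ["Discussed FastAPI debugging"],
})
print(prompt)
```

Because missing fields are simply skipped, the same helper works for a brand-new user (empty preamble) and a returning one (full context), and the token cost scales with what you actually know, not with the raw conversation history.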

For cleanup and TTL-like behavior, you can implement your own retention policy:

from datetime import datetime, timedelta

# Add this method to the AgentMemory class above
def cleanup_old_summaries(self, days: int = 30):
    """Remove conversation summaries older than N days"""
    cutoff = datetime.now() - timedelta(days=days)
    all_keys = self.memory.list_keys()
    
    for key in all_keys:
        if key.startswith("summary:"):
            timestamp_str = key.replace("summary:", "")
            timestamp = datetime.fromisoformat(timestamp_str)
            if timestamp < cutoff:
                self.memory.delete(key)
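Both _get_recent_summaries and the cleanup above lean on one property: ISO 8601 timestamps sort chronologically as plain strings, so no date parsing is needed just to order the keys. A quick demonstration:

```python
# ISO 8601 timestamps compare correctly as strings, which is why
# sorting the "summary:<timestamp>" keys gives chronological order.
keys = [
    "summary:2024-03-01T09:00:00",
    "summary:2024-01-15T14:30:00",
    "summary:2024-02-20T08:45:00",
]
newest_first = sorted(keys, reverse=True)
print(newest_first[0])  # summary:2024-03-01T09:00:00
```

This only holds if every timestamp is written in the same format and timezone; mixing naive and UTC-offset timestamps in the keys would break the ordering.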

Framework Integration

For LangChain agents, use the chat history adapter:

from botwire import BotWireChatHistory
from langchain.memory import ConversationBufferMemory

# Persistent chat history
chat_history = BotWireChatHistory(session_id="user-42")
memory = ConversationBufferMemory(
    chat_memory=chat_history,
    return_messages=True
)

# Use in your LangChain agent
from langchain.agents import AgentExecutor, create_openai_functions_agent

agent_executor = AgentExecutor(
    agent=your_agent,
    tools=your_tools,
    memory=memory,  # Now persists across sessions
    verbose=True
)

For CrewAI, add memory tools to your crew:

from crewai import Agent
from botwire.memory import memory_tools

# Get remember/recall/list_memory tools
tools = memory_tools("crew-project-alpha")

# Add to your CrewAI agent
agent = Agent(
    role="Research Assistant",
    goal="Remember important facts across tasks",
    tools=tools + other_tools,
    backstory="You can remember things between conversations..."
)

When NOT to Use BotWire

BotWire isn't the right choice for:

- Write-heavy workloads: the free tier caps at 1,000 writes/day per namespace
- Semantic or vector search: it's a plain key-value store, not an embedding database
- Latency-critical hot paths: every read and write goes over HTTP to a hosted service (unless you self-host)
- Sensitive data that can't leave your infrastructure, unless you run the self-hosted MIT version

FAQ

Why not just use Redis or a database? BotWire is zero-config with no infrastructure setup. Redis requires hosting, connection management, and persistence configuration. For agent memory, simple key-value with HTTP API is often enough.

Is this actually free? Yes, 1000 writes/day per namespace forever. Unlimited reads. No signup required. You can also self-host the MIT-licensed version.

What about data privacy? Data is stored on BotWire's servers by default. For sensitive use cases, self-host the open source version (single FastAPI + SQLite service) or use a private namespace.
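The FAQ describes the self-hosted version as a single FastAPI + SQLite service. Stripping away the HTTP layer, the storage core of such a service can be sketched with nothing but the stdlib — this is an illustration of the idea, not BotWire's actual implementation:

```python
import json
import sqlite3

class KVStore:
    """Minimal SQLite-backed key-value store with namespaces,
    in the spirit of BotWire's Memory (illustrative sketch only)."""
    def __init__(self, namespace: str, path: str = ":memory:"):
        self.ns = namespace
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS kv "
            "(ns TEXT, k TEXT, v TEXT, PRIMARY KEY (ns, k))"
        )

    def set(self, key: str, value):
        # JSON-encode values so dicts and lists round-trip cleanly
        self.db.execute(
            "INSERT OR REPLACE INTO kv VALUES (?, ?, ?)",
            (self.ns, key, json.dumps(value)),
        )
        self.db.commit()

    def get(self, key: str):
        row = self.db.execute(
            "SELECT v FROM kv WHERE ns = ? AND k = ?", (self.ns, key)
        ).fetchone()
        return json.loads(row[0]) if row else None

store = KVStore("user-42")
store.set("preferences", {"language": "python"})
print(store.get("preferences"))  # {'language': 'python'}
```

Point the constructor at a file path instead of ":memory:" and the data survives restarts, which is the whole point: for agent memory, a namespaced key-value table really is most of the job.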

Get Started

Add persistent memory to your AI agent in under 2 minutes. pip install botwire and start building agents that actually remember.

Try it at https://botwire.dev — no signup required.
