Modal.com AI Agents with Persistent Memory
Free · Open source (MIT) · Works with LangChain, CrewAI, AutoGen · No signup
Building AI agents on Modal's serverless GPU platform hits a wall when your agent needs to remember things between function calls. Modal functions are stateless by design — great for scaling, terrible for agent memory.
The Memory Problem on Modal
Modal.com spins up fresh containers for each function invocation. Your AI agent might have a brilliant conversation, learn user preferences, or build up context over multiple interactions, but the moment Modal scales down your container, that memory vanishes.
This breaks agent workflows that depend on persistence:
# This agent forgets everything between Modal calls
@modal.function()
def chat_agent(user_message: str):
    # Where did our conversation history go?
    # What were the user's preferences?
    # What tasks were we working on?
    context = ???  # Lost in the void
    return generate_response(user_message, context)
Your agent becomes a goldfish, starting fresh every time. Users repeat themselves. Context disappears. Multi-step reasoning falls apart. You need persistent memory that survives Modal's serverless lifecycle.
The Fix: BotWire Memory
Install BotWire to add persistent key-value memory to your Modal AI agents:
pip install botwire
Here's a Modal agent that remembers across function calls:
import modal
from datetime import datetime

from botwire import Memory

app = modal.App("persistent-agent")

@app.function()
def chat_agent(user_id: str, message: str):
    # Each user gets isolated memory
    memory = Memory(f"user-{user_id}")

    # Retrieve conversation history
    history = memory.get("conversation") or []
    history.append({"role": "user", "content": message})

    # Your AI logic here
    response = generate_response(history)
    history.append({"role": "assistant", "content": response})

    # Persist for next time
    memory.set("conversation", history)
    memory.set("last_seen", datetime.now().isoformat())

    return response
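One practical wrinkle in the pattern above: the conversation list grows on every turn, so each write gets larger and eventually eats into the 50MB-per-namespace storage limit. A simple cap on stored turns keeps writes bounded — the `trim_history` helper below is a hypothetical addition for illustration, not part of BotWire:

```python
def trim_history(history, max_turns=20):
    """Keep only the most recent turns so each write stays small.

    A turn is one user message plus one assistant reply, so we keep
    the last max_turns * 2 entries of the role/content list.
    """
    return history[-(max_turns * 2):]

# Example: a 50-turn (100-message) conversation trimmed to the last 20 turns
history = [{"role": "user", "content": f"msg {i}"} for i in range(100)]
trimmed = trim_history(history, max_turns=20)
```

Call it just before `memory.set("conversation", ...)` so the stored value never grows past a fixed size.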
How It Works
BotWire Memory is a persistent key-value store designed for agents. The Memory("namespace") constructor creates an isolated memory space — think of it as a Redis namespace that survives process restarts.
from botwire import Memory
# Different namespaces = isolated memory
user_memory = Memory("user-alice")
session_memory = Memory("session-xyz")
agent_memory = Memory("agent-research-bot")
# Store any JSON-serializable data
user_memory.set("preferences", {"language": "python", "style": "concise"})
user_memory.set("task_progress", {"step": 3, "completed": ["research", "outline"]})
# Retrieve with defaults
prefs = user_memory.get("preferences", {})
tasks = user_memory.get("active_tasks", [])
Memory persists across Modal function calls, container restarts, and even different machines. Your agent can pick up exactly where it left off.
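To make those semantics concrete without calling the hosted API, here is a minimal dict-backed stand-in that mimics the interface used above. It only illustrates namespace isolation, defaults, and the JSON-serializable contract — the real `Memory` class persists over HTTP, which this sketch does not:

```python
import json

_STORE = {}  # module-level; BotWire persists this remotely instead

class FakeMemory:
    """Dict-backed stand-in mimicking the Memory interface shown above."""

    def __init__(self, namespace):
        # Same namespace -> same underlying dict, like a Redis namespace
        self._ns = _STORE.setdefault(namespace, {})

    def set(self, key, value):
        # Mirror the JSON-serializable contract: reject non-JSON values early
        self._ns[key] = json.loads(json.dumps(value))

    def get(self, key, default=None):
        return self._ns.get(key, default)

    def delete(self, key):
        self._ns.pop(key, None)

    def list_keys(self):
        return list(self._ns.keys())

# Two constructions of the same namespace see the same data...
FakeMemory("user-alice").set("preferences", {"style": "concise"})
prefs = FakeMemory("user-alice").get("preferences", {})

# ...while a different namespace stays isolated
other = FakeMemory("user-bob").get("preferences", {})
```

A stand-in like this is also handy for unit-testing agent logic locally before deploying to Modal.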
For complex state management, you can list all keys in a namespace and delete specific memories:
# See what's stored
all_keys = memory.list_keys()  # Returns list of strings

# Clean up old sessions
memory.delete("temp_data")

# Bulk operations for complex agents
for key in memory.list_keys():
    if key.startswith("cache_"):
        memory.delete(key)
Modal Integration Patterns
Here's a production pattern for a Modal AI agent with different memory scopes:
import time

import modal
from datetime import datetime

from botwire import Memory

app = modal.App("multi-memory-agent")

@app.function()
def research_agent(user_id: str, query: str):
    # User-specific long-term memory
    user_memory = Memory(f"user-{user_id}")

    # Session-specific working memory
    session_id = user_memory.get("current_session") or str(int(time.time()))
    session_memory = Memory(f"session-{session_id}")

    # Global agent knowledge
    global_memory = Memory("research-agent-global")

    # Track user research interests over time
    interests = user_memory.get("research_interests", [])
    if query not in interests:
        interests.append(query)
        user_memory.set("research_interests", interests[-10:])  # Keep recent 10

    # Build session context
    session_context = session_memory.get("context", [])
    session_context.append(f"User query: {query}")

    # Use global learnings
    similar_queries = global_memory.get("query_patterns", {})

    # Your AI research logic here...
    result = perform_research(query, session_context, similar_queries)

    # Update all memory layers
    session_context.append(f"Research result: {result['summary']}")
    session_memory.set("context", session_context)
    session_memory.set("last_updated", datetime.now().isoformat())
    user_memory.set("current_session", session_id)
    user_memory.set("last_research", query)

    # Learn globally (anonymized)
    global_memory.set(f"pattern-{len(similar_queries)}", {
        "query_type": classify_query(query),
        "successful_approach": result["approach"],
    })

    return result
When NOT to Use BotWire
BotWire Memory isn't the right tool for:
• Vector/semantic search — it's key-value storage, not a vector database. Use Pinecone or Weaviate for embeddings.
• High-throughput caching — the 1000 writes/day limit makes it unsuitable for request-level caching. Use Redis for that.
• Sub-millisecond latency — the HTTP API adds network overhead. Not suitable for real-time gaming or HFT applications.
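Because of that 1000 writes/day cap, a long-running agent may want to meter its own writes rather than fail mid-conversation. Below is a sketch of a client-side budget guard — a hypothetical wrapper, not a BotWire feature, and since BotWire doesn't expose quota state, it only counts writes issued through the wrapper itself:

```python
from datetime import date

class WriteBudget:
    """Client-side guard that skips writes once a daily quota is spent.

    Wraps any object with a set(key, value) method, so it works with
    the Memory class above or any stand-in.
    """

    def __init__(self, memory, daily_limit=1000):
        self._memory = memory
        self._limit = daily_limit
        self._day = date.today()
        self._used = 0

    def set(self, key, value):
        today = date.today()
        if today != self._day:           # new day: reset the counter
            self._day, self._used = today, 0
        if self._used >= self._limit:    # over budget: drop the write
            return False
        self._memory.set(key, value)
        self._used += 1
        return True

class DictMemory:
    """Tiny in-process stand-in so the example runs without the real service."""
    def __init__(self):
        self.data = {}
    def set(self, key, value):
        self.data[key] = value

budget = WriteBudget(DictMemory(), daily_limit=2)
results = [budget.set("k", i) for i in range(3)]  # third write is dropped
```

Returning `False` instead of raising keeps the agent responsive when the quota runs out; callers that must persist can check the return value and queue the write for the next day.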
FAQ
Why not just use Redis or database connections in Modal? Managing Redis connections, handling failures, and provisioning infrastructure adds complexity. BotWire works immediately with zero setup — no connection strings, no credentials, no infrastructure.
Is this actually free? Yes. 1000 writes per day per namespace, 50MB storage per namespace, unlimited reads. No credit card required. The HTTP API at botwire.dev is free to use.
What about data privacy? BotWire is open source (MIT license) and you can self-host the FastAPI service. For production agents with sensitive data, run your own instance on your infrastructure.
Get Started
Add persistent memory to your Modal AI agents in one line. Install BotWire and your agents will remember everything that matters across serverless function calls.
pip install botwire
Full documentation and self-hosting guide at botwire.dev.