Persistent Memory for Voice AI Agents (ElevenLabs, Vapi, Retell)

Free · Open source (MIT) · Works with LangChain, CrewAI, AutoGen · No signup

Your voice AI agent works perfectly for one call, then forgets everything. The customer calls back asking about their order from yesterday, and your bot has no clue what they're talking about. You need persistent memory that survives across calls, restarts, and deployments.

The Memory Problem

Voice agents built with ElevenLabs, Vapi, or Retell AI are stateless by design. Each phone call starts fresh with no context from previous interactions. Your agent might perfectly handle a customer's insurance claim, but when they call back an hour later for an update, it's like talking to a goldfish.

This breaks the customer experience badly. Imagine calling your bank and having to re-explain your entire situation every single time. The agent can't remember previous orders, the customer's preferred name, or the status of an open issue.

The root cause? These voice platforms focus on real-time conversation, not persistence. They handle the complex audio processing and natural language understanding, but leave memory management to you. Without explicit memory storage, everything dies when the call ends.
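The failure mode is easy to reproduce without any voice platform: state kept in a handler's local variables is rebuilt from scratch on every invocation. A minimal sketch (plain Python, no voice SDK involved):

```python
def handle_call(user_message):
    # Per-call state lives only for the duration of this invocation
    session_notes = {}
    session_notes["last_message"] = user_message
    # Anything learned on a previous call is simply not here
    return session_notes.get("last_order", "no order on file")

# First call: the customer states their order number
print(handle_call("My order is ORDER-789"))   # "no order on file"

# Second call: the local dict was rebuilt, so nothing carried over
print(handle_call("Where is my order?"))      # "no order on file"
```

Webhook-based voice agents behave exactly like this function: each call is a fresh invocation, so anything worth remembering must be written to external storage before the handler returns.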

The Fix

Install BotWire Memory to add persistent storage that survives calls, crashes, and deployments:

pip install botwire

Here's the core solution:

from botwire import Memory

# Initialize memory with a namespace (e.g., customer ID or phone number)
memory = Memory("customer-555-0123")

# During a call - save important context
memory.set("last_order", "ORDER-789")
memory.set("preferred_name", "Sarah")
memory.set("issue_status", "pending_review")

# Next call - retrieve the context
previous_order = memory.get("last_order")  # "ORDER-789"
name = memory.get("preferred_name")        # "Sarah"

print(f"Welcome back {name}! I see your order {previous_order} is still pending.")

How It Works

BotWire Memory is a persistent key-value store designed specifically for agent memory. Each Memory("namespace") creates an isolated storage space - typically one per customer or conversation thread.

The data persists on BotWire's backend, so it survives process restarts, server crashes, and redeployments. No setup required - it works immediately after installation.

from botwire import Memory
import json

memory = Memory("customer-alice")

# Store complex data as JSON
customer_context = {
    "orders": ["ORDER-123", "ORDER-456"],
    "preferences": {"language": "english", "callback_time": "morning"},
    "last_issue": "shipping_delay"
}
memory.set("context", json.dumps(customer_context))

# Retrieve and parse
stored_context = json.loads(memory.get("context") or "{}")
print(f"Customer has {len(stored_context.get('orders', []))} previous orders")

For conversation history, BotWire includes a LangChain adapter that many voice frameworks can consume:

from botwire import BotWireChatHistory

# Each customer gets their own conversation history
history = BotWireChatHistory(session_id="customer-555-0123")

# This automatically persists across calls
history.add_user_message("I want to check my order status")
history.add_ai_message("Sure! Let me look that up for you...")

# Later calls can access the full conversation
messages = history.messages  # All previous exchanges

Memory operations work across processes and machines. If your voice agent scales across multiple servers, they all share the same persistent memory for each customer.

Voice Platform Integration

Most voice platforms let you inject custom context or use webhook callbacks. Here's how to integrate persistent memory with your voice agent:

from botwire import Memory
from datetime import date
import json

def handle_voice_webhook(customer_phone, user_message):
    # Use the phone number as the memory namespace
    memory = Memory(f"phone-{customer_phone}")
    
    # Retrieve customer context from previous calls
    context = json.loads(memory.get("context") or "{}")
    
    # Build conversation context for the AI
    system_prompt = f"""
    Customer context: {context.get('summary', 'New customer')}
    Previous issues: {context.get('issues', [])}
    Preferences: {context.get('preferences', {})}
    
    Handle this message: {user_message}
    """
    
    # After the AI responds, update memory for the next call
    context['last_contact'] = date.today().isoformat()
    context['summary'] = "Customer asking about shipping"
    memory.set("context", json.dumps(context))
    
    return system_prompt

This pattern works with ElevenLabs conversational AI, Vapi's webhook system, or Retell AI's custom functions.
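One caveat when keying namespaces by phone number: inconsistent formatting ("+1 (555) 012-3456" vs "5550123456") silently creates separate memories for the same customer. A naive normalizer is sketched below; it is an assumption for this article (production code should use a dedicated library such as phonenumbers, and the 10-digit rule is a North American convention):

```python
def normalize_phone(raw, default_country="1"):
    """Strip formatting and coerce to a digits-only E.164-style key."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    # Assume a bare 10-digit number is missing its country code (NANP assumption)
    if len(digits) == 10:
        digits = default_country + digits
    return digits

# All of these collapse to the same namespace key
for raw in ["+1 (555) 012-3456", "555-012-3456", "15550123456"]:
    print(normalize_phone(raw))  # 15550123456
```

Normalizing before building the namespace (`Memory(f"phone-{normalize_phone(customer_phone)}")`) keeps repeat callers attached to the same memory.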

When NOT to Use BotWire

BotWire Memory isn't right for every use case:

• Vector search or embeddings - use Pinecone, Weaviate, or Chroma for semantic similarity search
• High-throughput applications - the free tier allows 1000 writes/day per namespace
• Sub-millisecond latency - HTTP calls add ~50-200ms of overhead vs in-memory storage

FAQ

Why not just use Redis or a database? BotWire works immediately with zero configuration: no servers to manage, no connection strings, no authentication to set up. For agent memory, you want to focus on your logic, not infrastructure.

Is this actually free? Yes, forever. 1000 writes per day per namespace, 50MB storage per namespace, unlimited reads. No credit card required.

What about data privacy? Data is stored on BotWire's backend. For sensitive applications, you can self-host since it's open source (MIT license). The entire service is a single FastAPI + SQLite application.
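The "FastAPI + SQLite" description above hints at how little machinery persistent key-value memory actually needs. Here is a rough sketch of what such a storage layer can look like, using only the stdlib sqlite3 module; this is this article's illustration, not BotWire's actual schema:

```python
import sqlite3

class SqliteMemory:
    """Namespaced key-value store backed by a SQLite file."""
    def __init__(self, namespace, path="memory.db"):
        self.ns = namespace
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS kv "
            "(ns TEXT, key TEXT, value TEXT, PRIMARY KEY (ns, key))")

    def set(self, key, value):
        with self.db:  # implicit transaction -> durable on commit
            self.db.execute(
                "INSERT OR REPLACE INTO kv VALUES (?, ?, ?)",
                (self.ns, key, value))

    def get(self, key):
        row = self.db.execute(
            "SELECT value FROM kv WHERE ns = ? AND key = ?",
            (self.ns, key)).fetchone()
        return row[0] if row else None

# ":memory:" keeps the demo self-contained; a file path gives real persistence
m = SqliteMemory("customer-555-0123", path=":memory:")
m.set("last_order", "ORDER-789")
print(m.get("last_order"))  # ORDER-789
```

With a file path instead of ":memory:", data written by one process is readable by the next, which is the essence of what the hosted backend provides (plus the HTTP API and multi-machine access).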

Get Started

Stop losing customer context between voice calls. Install BotWire Memory and give your voice agents persistent memory in under 5 minutes.

pip install botwire

Documentation and examples at https://botwire.dev
