Adding External Memory to the OpenAI Assistants API

Free · Open source (MIT) · Works with LangChain, CrewAI, AutoGen · No signup

The OpenAI Assistants API stores conversation history in threads, but that memory is scoped to a single thread. Developers building multi-session agents need persistent memory that survives across conversations, users, and application restarts. Here's how to add that external memory to your OpenAI assistants without complex infrastructure.

The Memory Problem with OpenAI Assistants

OpenAI threads are isolated containers. Context lives only inside the thread that created it, so the moment a conversation moves to a new thread, everything the assistant learned is gone. This breaks user experiences that depend on continuity:

import openai

client = openai.OpenAI()

# Thread 1: User teaches the assistant something
thread1 = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread1.id,
    role="user",
    content="My favorite coffee is Ethiopian Yirgacheffe"
)

# Thread 2: Assistant has no memory of the preference
thread2 = client.beta.threads.create()  # Blank slate again

Your assistant can't remember user preferences, previous conversations, or learned facts. Each interaction starts from zero. For customer service bots, personal assistants, or any agent that should improve over time, this memory gap kills the experience.

Adding Persistent Memory

Install BotWire to give your assistants external memory that persists across threads, sessions, and deployments:

pip install botwire

Here's how to store and retrieve persistent state alongside the Assistants API:

from botwire import Memory
import openai

# Initialize memory with a namespace
memory = Memory("user-preferences")

# Store persistent data
memory.set("user_123_coffee", "Ethiopian Yirgacheffe")
memory.set("user_123_meetings", ["Monday 9am", "Wednesday 2pm"])

# Retrieve in any thread
favorite_coffee = memory.get("user_123_coffee")
print(favorite_coffee)  # "Ethiopian Yirgacheffe"

This memory survives application restarts and works across different threads for the same user.

Memory Patterns for Assistants

The key is deciding what to store externally versus what stays in the OpenAI thread. Threads handle conversation flow; external memory handles facts, preferences, and cross-session state.
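One way to make that split explicit is a small routing helper. This is an illustrative sketch, not part of BotWire; the category names are assumptions you would adapt to your own data model:

```python
def storage_target(item_kind: str) -> str:
    """Route each piece of state to the right store.

    Conversational turns stay in the OpenAI thread; durable facts,
    preferences, and cross-session flags go to external memory.
    """
    durable = {"preference", "fact", "flag"}
    return "external_memory" if item_kind in durable else "thread"

print(storage_target("preference"))  # external_memory
print(storage_target("message"))     # thread
```

Keeping this decision in one function makes it easy to audit what your application persists outside OpenAI.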

User Preferences and Facts

from botwire import Memory

# user_id identifies the current user (e.g. from your auth layer)
user_memory = Memory(f"user-{user_id}")

# Store structured preferences
user_memory.set("notification_time", "9:00 AM")
user_memory.set("timezone", "US/Pacific") 
user_memory.set("communication_style", "direct")

# Retrieve before creating assistant messages
prefs = {
    "time": user_memory.get("notification_time"),
    "style": user_memory.get("communication_style")
}
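Once preferences are retrieved, they need to reach the model as text. A minimal formatting helper might look like this (the wording of the snippet is an assumption, not a BotWire feature):

```python
def prefs_to_prompt(prefs: dict) -> str:
    """Render stored preferences as a short prompt snippet.

    Returns an empty string when there is nothing stored, so callers
    can prepend the result unconditionally.
    """
    if not prefs:
        return ""
    lines = [f"- {key}: {value}" for key, value in prefs.items()]
    return "Known user preferences:\n" + "\n".join(lines)
```

Prepending this snippet to the first message of each new thread is how the stored state actually influences the assistant's replies.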

Cross-User Knowledge Base

# Shared knowledge across all users
kb = Memory("company-knowledge")
kb.set("office_hours", "9 AM - 6 PM PST")
kb.set("return_policy", "30 days, receipt required")

# Any assistant can access this
policy = kb.get("return_policy")
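A common pattern is layering the two namespaces: check the user's own memory first, then fall back to the shared knowledge base. The sketch below uses plain dicts as stand-ins for the two Memory namespaces, so the lookup logic can be shown without a live BotWire connection:

```python
def lookup(key, user_store, shared_store, default=None):
    """Check user-specific memory first, then fall back to shared knowledge."""
    if key in user_store:
        return user_store[key]
    return shared_store.get(key, default)

user = {"return_policy": "45 days (VIP)"}
shared = {"return_policy": "30 days, receipt required"}
print(lookup("return_policy", user, shared))            # 45 days (VIP)
print(lookup("office_hours", user, shared, "unknown"))  # unknown
```

With real BotWire namespaces, the same logic would call `user_memory.get(key)` and `kb.get(key)` in that order.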

Listing and Managing Memory

# See all stored keys
all_keys = memory.list_keys()
print(f"Stored: {len(all_keys)} items")

# Clean up old data
memory.delete("outdated_preference")

# Check if data exists
if memory.get("user_onboarded"):
    # Skip intro flow
    pass

Memory persists across processes and machines: any deployment that uses the same namespace sees the same data. That means development, staging, and production will share state unless you give each environment its own namespace.
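One defensive pattern is to derive the namespace from the environment. The `APP_ENV` variable here is an assumed convention, not a BotWire requirement:

```python
import os

def memory_namespace(base: str) -> str:
    """Derive a per-environment namespace so prod and dev data stay separate."""
    env = os.environ.get("APP_ENV", "dev")
    return f"{base}-{env}"

# Memory(memory_namespace("user-preferences"))
# -> "user-preferences-dev" locally, "user-preferences-prod" in production
```

This keeps test writes out of production data while preserving the zero-config workflow.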

Integration with OpenAI Assistants

Combine BotWire memory with the Assistants API to create context-aware responses:

import openai
from botwire import Memory

client = openai.OpenAI()
memory = Memory("assistant-context")

def chat_with_memory(user_id, message):
    # Load any stored context for this user
    context = memory.get(f"context_{user_id}") or {}

    # Create a fresh thread enriched with the stored context
    thread = client.beta.threads.create()
    system_context = f"User preferences: {context}\n\n" if context else ""

    client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content=f"{system_context}User: {message}"
    )

    # Run the assistant and wait for it to complete
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id,
        assistant_id="asst-xxx"
    )

    # Store important facts for next time
    if "remember" in message.lower():
        memory.set(f"context_{user_id}", {"last_topic": message})

    # Return the assistant's reply (messages.list returns newest first)
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    return messages.data[0].content[0].text.value
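The "remember" check above stores the whole message. A slightly richer heuristic is to extract just the fact the user asked you to keep. This is an illustrative sketch, and a real application might use the model itself to extract facts instead:

```python
def extract_fact(message: str):
    """Pull the fact out of a 'remember ...' style message (heuristic sketch)."""
    lowered = message.lower()
    marker = "remember"
    idx = lowered.find(marker)
    if idx == -1:
        return None  # nothing to store
    fact = message[idx + len(marker):].strip(" ,:.")
    if fact.lower().startswith("that "):
        fact = fact[5:]
    return fact or None

print(extract_fact("Please remember that I like oat milk"))  # I like oat milk
print(extract_fact("hello"))                                 # None
```

Storing the extracted fact instead of the raw message keeps the memory namespace clean and the injected context short.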


FAQ

Q: Why not just use Redis or a database? A: You can, but then you need hosting, credentials, connection pooling, and error handling. BotWire works instantly with zero setup.

Q: Is this actually free? A: Yes, 1000 writes per day per namespace, 50MB storage, unlimited reads. No credit card, no trial expiration.

Q: What about data privacy? A: Data is stored on BotWire's servers. For sensitive applications, self-host the open-source version (single FastAPI + SQLite service).

BotWire Memory solves the OpenAI Assistants API persistence gap with zero configuration. Install with pip install botwire and start building assistants that actually remember. Full documentation at botwire.dev.

Install in one command:

pip install botwire

Start free at botwire.dev