Portkey Gateway + BotWire Memory: Persistent State for a Stateless Gateway
Free · Open source (MIT) · Works with LangChain, CrewAI, AutoGen · No signup
You're running LLMs through Portkey's gateway but hitting a wall: your agents forget everything between requests. User sessions, conversation history, preferences — all gone when the process restarts. You need persistent memory that survives across invocations while keeping Portkey's routing and observability.
The Stateless Gateway Problem
LLM gateways like Portkey excel at routing, load balancing, and observability, but they're stateless by design. Each request is independent — perfect for reliability, terrible for agents that need to remember things.
Here's what breaks: your customer service bot forgets the user's name mid-conversation when it restarts. Your coding assistant loses track of the current project context. Your RAG pipeline can't remember which documents were already processed. You end up rebuilding user context from scratch on every cold start, burning tokens and creating janky experiences.
The typical workaround — cramming everything into prompt context — hits token limits fast and makes every request expensive. Session caches help during a single conversation but vanish when your container restarts. You need something that persists across processes, machines, and deployments.
The Fix: BotWire Memory + Portkey Gateway
Install BotWire Memory to add persistent state to your Portkey setup:
pip install botwire
Here's the basic pattern — wrap your Portkey calls with memory operations:
from botwire import Memory
from portkey_ai import Portkey

# Initialize memory namespace and Portkey client
memory = Memory("user-sessions")
portkey = Portkey(
    api_key="your-portkey-key",
    virtual_key="your-virtual-key"
)
def chat_with_memory(user_id: str, message: str):
    # Get conversation history from memory
    history = memory.get(f"{user_id}_history") or []

    # Add user message
    history.append({"role": "user", "content": message})

    # Call through Portkey gateway
    response = portkey.chat.completions.create(
        model="gpt-4",
        messages=history[-10:]  # Keep last 10 messages
    )

    # Store updated history
    assistant_msg = {"role": "assistant", "content": response.choices[0].message.content}
    history.append(assistant_msg)
    memory.set(f"{user_id}_history", history)

    return response.choices[0].message.content
How It Works
BotWire Memory acts as a persistent key-value store that survives restarts. Each Memory("namespace") creates an isolated space for your data. The set() and get() operations hit a free HTTP API at botwire.dev — no signup required.
The memory persists across process restarts, container deployments, and machine changes. Your Portkey gateway handles the LLM routing while BotWire handles the state.
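To make the namespace model concrete, here's a minimal in-memory stand-in for the get/set/delete/list interface described above. This stub is an illustration only — the real BotWire client persists these calls over HTTP to botwire.dev — but it shows how two namespaces keep their keys isolated from each other:

```python
# Minimal stand-in for the Memory interface, to illustrate namespace
# isolation. The real BotWire client persists to an HTTP API; this stub
# only mimics the get/set/delete/list surface in process memory.
class FakeMemory:
    _stores: dict = {}  # shared across instances, keyed by namespace

    def __init__(self, namespace: str):
        # Each namespace gets its own isolated key-value space
        self.store = FakeMemory._stores.setdefault(namespace, {})

    def set(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)

    def delete(self, key):
        self.store.pop(key, None)

    def list(self):
        return list(self.store.keys())

sessions = FakeMemory("user-sessions")
workspace = FakeMemory("agent-workspace")

sessions.set("user-1_history", [{"role": "user", "content": "hi"}])
workspace.set("context_user-1", {"project": "demo"})

# Keys in one namespace are invisible to the other
print(sessions.list())   # ['user-1_history']
print(workspace.list())  # ['context_user-1']
```

Swapping `FakeMemory` for `Memory` keeps the same call sites; only the storage backend changes.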
For more complex scenarios, you can store structured data and implement TTL-style cleanup:
from datetime import datetime, timedelta
from botwire import Memory

memory = Memory("agent-workspace")

def store_user_context(user_id: str, context: dict):
    # Add timestamp for manual TTL
    context["last_updated"] = datetime.now().isoformat()
    memory.set(f"context_{user_id}", context)

def get_fresh_context(user_id: str, max_age_hours: int = 24):
    context = memory.get(f"context_{user_id}")
    if not context:
        return None
    last_updated = datetime.fromisoformat(context["last_updated"])
    if datetime.now() - last_updated > timedelta(hours=max_age_hours):
        memory.delete(f"context_{user_id}")
        return None
    return context

# List all stored contexts
all_keys = memory.list()  # Returns all keys in namespace
context_keys = [k for k in all_keys if k.startswith("context_")]
This pattern works across multiple processes. Your web server, background workers, and scheduled jobs all share the same memory namespace.
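A scheduled job is a natural fit for the manual-TTL pattern: it can sweep every stale `context_` key in one pass instead of waiting for each key to be read. Here's a sketch of that sweep, using a dict-backed stub with the same list/get/delete interface in place of the real BotWire client so it runs standalone:

```python
from datetime import datetime, timedelta

# Dict-backed stand-in with the same list/get/delete surface as the
# BotWire client, so this sketch runs without network access.
class DictMemory:
    def __init__(self):
        self._d = {}
    def set(self, k, v): self._d[k] = v
    def get(self, k): return self._d.get(k)
    def delete(self, k): self._d.pop(k, None)
    def list(self): return list(self._d.keys())

def sweep_stale_contexts(memory, max_age_hours: int = 24) -> int:
    """Delete every context_* key older than max_age_hours; return count."""
    removed = 0
    for key in memory.list():
        if not key.startswith("context_"):
            continue
        ctx = memory.get(key)
        last = datetime.fromisoformat(ctx["last_updated"])
        if datetime.now() - last > timedelta(hours=max_age_hours):
            memory.delete(key)
            removed += 1
    return removed

memory = DictMemory()
memory.set("context_alice", {"last_updated": datetime.now().isoformat()})
memory.set("context_bob",
           {"last_updated": (datetime.now() - timedelta(hours=48)).isoformat()})
removed = sweep_stale_contexts(memory)
print(removed)  # 1: only the 48-hour-old context is deleted
```

Run it from cron or any scheduler; because the cleanup job and the web server share the namespace, the server never sees a context the sweeper has already expired.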
LangChain Integration with Portkey
If you're using LangChain with Portkey, BotWire provides a drop-in chat history adapter:
from langchain_community.chat_models import ChatOpenAI
from botwire import BotWireChatHistory

# LangChain with Portkey endpoint
llm = ChatOpenAI(
    openai_api_base="https://api.portkey.ai/v1",
    openai_api_key="your-portkey-key",
    model="gpt-4"
)

# Persistent chat history
history = BotWireChatHistory(session_id="user-42")

# Add messages - they persist automatically
history.add_user_message("What's the weather?")
history.add_ai_message("I need your location to check weather.")

# Get messages for a LangChain chain
messages = history.messages
response = llm.invoke(messages)
history.add_ai_message(response.content)
The BotWireChatHistory handles the serialization and gives you a LangChain-compatible interface that persists across restarts.
When NOT to Use BotWire
Don't use BotWire for:
- Vector search or semantic similarity — it's key-value only, not a vector database
- High-throughput workloads — the free tier is capped at 1,000 writes/day per namespace
- Sub-millisecond latency — HTTP round-trip adds ~50-200ms depending on your location
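If a chatty agent would blow through a daily write budget, one common mitigation is to coalesce writes: keep the latest value per key in process memory and flush on an interval, so fifty rapid updates become one stored write. A sketch of that idea, using a plain dict as the backend (in real use it would be a BotWire `Memory` instance with the same `set()` call):

```python
import time

# Sketch: coalesce frequent in-process updates into occasional backend
# writes so a chatty agent stays under a daily write budget. The backend
# here is a dict standing in for a remote key-value store.
class CoalescingWriter:
    def __init__(self, backend, flush_interval_s: float = 60.0):
        self.backend = backend
        self.flush_interval_s = flush_interval_s
        self.pending = {}          # latest value per key, not yet written
        self.last_flush = time.monotonic()
        self.writes = 0

    def set(self, key, value):
        self.pending[key] = value  # overwrite in place; no backend write yet
        if time.monotonic() - self.last_flush >= self.flush_interval_s:
            self.flush()

    def flush(self):
        for key, value in self.pending.items():
            self.backend[key] = value
            self.writes += 1
        self.pending.clear()
        self.last_flush = time.monotonic()

backend = {}
writer = CoalescingWriter(backend, flush_interval_s=3600)
for i in range(50):                  # 50 rapid updates to one key...
    writer.set("user-1_history", list(range(i)))
writer.flush()
print(writer.writes)                 # 1: coalesced into a single write
```

The trade-off is durability: updates buffered since the last flush are lost if the process dies, so flush on shutdown and pick an interval that matches how much history you can afford to replay.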
FAQ
Q: Why not just use Redis or a database? A: BotWire requires zero setup — no servers, no connection strings, no auth. Perfect for prototypes and small production workloads where you want memory without infrastructure.
Q: Is this actually free? A: Yes, 1000 writes/day per namespace forever. Unlimited reads. No credit card required. You can self-host the MIT-licensed version if you need more.
Q: What about data privacy? A: Data flows through botwire.dev by default. For sensitive data, self-host the open-source version — it's a single FastAPI + SQLite service.
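To give a feel for how small a self-hosted storage layer can be, here's a sketch of a namespaced key-value table on SQLite — the kind of schema a single-service deployment needs. This is an illustration under assumptions, not BotWire's actual server code; the real project wraps a layer like this in FastAPI HTTP endpoints:

```python
import json
import sqlite3

# Sketch of a self-hosted storage layer: one SQLite table keyed by
# (namespace, key), with values stored as JSON text. Illustrative only,
# not the actual BotWire server implementation.
class SqliteKV:
    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS kv ("
            "  namespace TEXT, key TEXT, value TEXT,"
            "  PRIMARY KEY (namespace, key))"
        )

    def set(self, namespace: str, key: str, value):
        # INSERT OR REPLACE gives upsert semantics on the primary key
        self.db.execute(
            "INSERT OR REPLACE INTO kv VALUES (?, ?, ?)",
            (namespace, key, json.dumps(value)),
        )
        self.db.commit()

    def get(self, namespace: str, key: str):
        row = self.db.execute(
            "SELECT value FROM kv WHERE namespace = ? AND key = ?",
            (namespace, key),
        ).fetchone()
        return json.loads(row[0]) if row else None

kv = SqliteKV()
kv.set("user-sessions", "user-1_history", [{"role": "user", "content": "hi"}])
print(kv.get("user-sessions", "user-1_history"))
```

Point the path at a file instead of `":memory:"` and the data survives restarts on your own disk, which is the whole privacy argument for self-hosting.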
BotWire Memory bridges the gap between stateless gateways and stateful agents. Your Portkey setup gets persistent memory without architectural changes.
pip install botwire
Start building at botwire.dev.