Python SDK. Memory as a Service for Python.

Add persistent, searchable memory to your AI agents, chatbots, and applications with a few lines of code. Sync and async clients, full type hints, and automatic retries.

  • Package: octamem (PyPI)
  • Version: 1.0.1
  • Requires: Python 3.9+

Quick start

Get from zero to a working call in under 30 seconds.

1. Install

terminal (bash)
pip install octamem

2. Run

quick_start.py (python)
from octamem import OctaMem

client = OctaMem(api_key="your-api-key-here")

# Search memories (previous_context scopes the query to a conversation or topic)
results = client.get(
    query="What did we decide about the launch date?",
    previous_context="Q1 product planning meeting",
)
print(results)

# Add a memory (previous_context links this to a conversation or topic)
client.add(
    content="Launch date set for April 15. Beta opens March 20.",
    previous_context="Q1 product planning meeting",
)

# Get usage and limits
details = client.details()
print(details)

To use an environment variable instead, set OCTAMEM_API_KEY and call OctaMem() with no arguments.
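The fallback from an explicit key to the environment variable can be pictured with a small sketch. The helper name resolve_api_key and the error it raises are hypothetical and not part of the SDK; only the variable name OCTAMEM_API_KEY comes from this page.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else read OCTAMEM_API_KEY."""
    key = explicit or os.environ.get("OCTAMEM_API_KEY")
    if not key:
        # Assumption: fail early with a clear message instead of sending an empty key.
        raise RuntimeError("Set OCTAMEM_API_KEY or pass api_key explicitly")
    return key
```

Calling a check like this at startup surfaces a missing key before the first request is made.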

Use cases

  • AI agents & chatbots — Give your bot a persistent memory so it remembers context and past conversations.
  • Personal knowledge bases — Store and search notes, decisions, and learnings by natural language.
  • Multi-tenant apps — One API key per user or tenant; isolate memories per client.
  • RAG and retrieval — Use OctaMem as the memory layer and query with natural language instead of managing embeddings yourself.
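For the multi-tenant case, one possible pattern is to build a client lazily per tenant and cache it. The sketch below is an assumption about how an application might organize this, not something the SDK provides: TenantClients and its factory argument are hypothetical names, and the factory is left generic so the snippet runs without the SDK installed.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class TenantClients(Generic[T]):
    """Hypothetical helper: lazily builds and caches one client per tenant."""

    def __init__(self, api_keys: Dict[str, str], factory: Callable[[str], T]) -> None:
        self._api_keys = api_keys      # tenant id -> API key
        self._factory = factory        # builds a client from an API key
        self._clients: Dict[str, T] = {}

    def for_tenant(self, tenant_id: str) -> T:
        if tenant_id not in self._clients:
            # A KeyError here means no key is configured for this tenant.
            self._clients[tenant_id] = self._factory(self._api_keys[tenant_id])
        return self._clients[tenant_id]
```

With the SDK installed, the factory could be lambda key: OctaMem(api_key=key), so each tenant's calls go through a client bound to that tenant's key.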

Installation

From PyPI (recommended).

terminal (bash)
pip install octamem

After installation, the statement import octamem works from any Python file in that environment.

API at a glance

  • get(...): parameters query, previous_context="" (or params: GetParams); returns SearchResult. Search memories by natural language. Sends {"query", "previousContext"} to /api/memory/search.
  • add(...): parameters content, previous_context="" (or params: AddParams); returns AddResult. Store content in memory. Sends {"content", "previousContext"} to /api/memory/add.
  • details(): no parameters; returns MemoryDetails. Get usage and limits. Sends {} to /api/memory/details.

All methods exist on both the sync client (OctaMem) and the async client (AsyncOctaMem). With the async client, await each call, e.g. await client.get(...).

Parameter styles

Pass keyword arguments (recommended) or a params dict. The dict keys are query and previousContext for get, and content and previousContext for add.

param_styles.py (python)
# Keyword style (recommended) — e.g. chatbot recalling a support conversation
client.get(
    query="What did the customer say about their billing issue?",
    previous_context="Support conversation with customer #4521",
)
client.add(
    content="Customer requested a refund for the duplicate charge. Ticket escalated.",
    previous_context="Support conversation with customer #4521",
)

# Params dict
client.get(params={
    "query": "What are the user's notification preferences?",
    "previousContext": "User settings",
})
client.add(params={
    "content": "User prefers weekly digest and push for mentions only.",
    "previousContext": "User settings",
})
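The two styles end up as the same request body: the snake_case keyword previous_context is sent as the camelCase key previousContext. A minimal sketch of that mapping follows; the function names search_body and add_body are hypothetical, and only the body keys come from the API summary on this page.

```python
from typing import Dict


def search_body(query: str, previous_context: str = "") -> Dict[str, str]:
    """Hypothetical: body for a search call, per the documented keys."""
    return {"query": query, "previousContext": previous_context}


def add_body(content: str, previous_context: str = "") -> Dict[str, str]:
    """Hypothetical: body for an add call, per the documented keys."""
    return {"content": content, "previousContext": previous_context}


# Keyword style and params-dict style describe the same payload.
print(search_body("What are the user's notification preferences?", "User settings"))
```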

Response types (from the SDK)

The SDK returns Pydantic models. All of them allow extra fields from the API; call .model_dump() to get a plain dict.

  • get(...): returns SearchResult; fields: pending_id: str | None; extra fields allowed.
  • add(...): returns AddResult; fields: success: bool; extra fields allowed.
  • details(): returns MemoryDetails; fields: usage: int | None, limit: int | None; extra fields allowed.

Example usage

responses.py (python)
from octamem import OctaMem

client = OctaMem(api_key="your-api-key")

# Search — SearchResult (pending_id + any extra API fields)
results = client.get(
    query="What did we decide about the launch date?",
    previous_context="Q1 product planning meeting",
)
print(results)
print(results.pending_id)
print(results.model_dump())

# Add — AddResult (success + any extra API fields)
add = client.add(
    content="Launch date set for April 15. Beta opens March 20.",
    previous_context="Q1 product planning meeting",
)
print(add.success)

# Details — MemoryDetails (usage, limit)
details = client.details()
print(details)
print(details.usage, details.limit)
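Because usage and limit are both typed as int | None, code that does arithmetic on them should handle the missing case. The helper below is a hypothetical sketch; it assumes usage and limit are counted in the same unit, which this page does not state.

```python
from typing import Optional


def remaining_quota(usage: Optional[int], limit: Optional[int]) -> Optional[int]:
    """Hypothetical helper: remaining allowance, or None if either value is missing."""
    if usage is None or limit is None:
        return None
    # Clamp at zero in case usage has gone past the limit.
    return max(limit - usage, 0)
```

With a real response this would be called as remaining_quota(details.usage, details.limit).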

Sync vs async

  • Sync (OctaMem) — Use in scripts, CLI tools, or any blocking context. Simple client.get(), client.add(), client.details().
  • Async (AsyncOctaMem) — Use in async frameworks (FastAPI, asyncio, etc.). Same API with await.

Same capabilities; pick the style that matches your stack.

Async example

async_basic.py (python)
import asyncio
from octamem import AsyncOctaMem

async def main():
    client = AsyncOctaMem(api_key="your-api-key")
    results = await client.get(
        query="What are the user's notification preferences?",
        previous_context="User onboarding flow",
    )
    await client.add(
        content="User chose weekly digest emails and in-app notifications only.",
        previous_context="User onboarding flow",
    )
    details = await client.details()
    print(details)
    await client.close()

asyncio.run(main())

Or with a context manager:

async_ctx.py (python)
import asyncio
from octamem import AsyncOctaMem

async def main():
    # The context manager closes the client when the block exits
    async with AsyncOctaMem(api_key="your-api-key") as client:
        results = await client.get(
            query="What did we decide about the launch?",
            previous_context="Product roadmap sync",
        )
        await client.add(
            content="Ship MVP by end of Q2. Skip dark mode for v1.",
            previous_context="Product roadmap sync",
        )
        details = await client.details()
        print(details)

asyncio.run(main())

Configuration

Constructor options (same for OctaMem and AsyncOctaMem). All except api_key are keyword-only.

  • api_key (str | None, default os.environ.get("OCTAMEM_API_KEY")): API key. Prefer the environment variable for security.
  • base_url (str, default https://platform.octamem.com): API base URL.
  • timeout (float, default 30.0): Request timeout in seconds.
  • max_retries (int, default 3): Retries on connection / timeout / rate-limit / 5xx.
  • retry_delay (float, default 1.0): Base delay between retries (seconds).

config.py (python)
# With explicit api_key (or omit to use OCTAMEM_API_KEY env var)
client = OctaMem(
    api_key="your-api-key",
    base_url="https://platform.octamem.com",
    timeout=30.0,
    max_retries=3,
    retry_delay=1.0,
)

Error handling and retries

The client retries on connection errors, timeouts, rate limits, and 5xx responses. You can still handle specific errors and implement custom retry logic.
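To get a feel for how max_retries, retry_delay, and timeout interact, here is a rough sketch of a retry schedule. It assumes exponential doubling from the base delay, which this page does not confirm (it only says the SDK retries with backoff), and both function names are hypothetical.

```python
from typing import List


def backoff_schedule(max_retries: int = 3, retry_delay: float = 1.0) -> List[float]:
    """Assumed schedule: the delay doubles on each retry, starting at retry_delay."""
    return [retry_delay * (2 ** attempt) for attempt in range(max_retries)]


def worst_case_seconds(timeout: float = 30.0, max_retries: int = 3,
                       retry_delay: float = 1.0) -> float:
    """Upper bound if the first try and every retry hit the full timeout."""
    attempts = max_retries + 1  # the initial request plus each retry
    return attempts * timeout + sum(backoff_schedule(max_retries, retry_delay))


print(backoff_schedule())    # [1.0, 2.0, 4.0] under the doubling assumption
print(worst_case_seconds())  # 127.0 with the documented defaults
```

If a caller has a tighter latency budget than this bound, lowering timeout or max_retries in the constructor is the lever to reach for.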

Catch specific errors

errors.py (python)
from octamem import (
    OctaMem,
    AuthenticationError,
    RateLimitError,
    APIConnectionError,
    ValidationError,
)

client = OctaMem(api_key="your-api-key")

try:
    client.add(content="New note", previous_context="")
except AuthenticationError:
    print("Invalid or missing API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after: {e.retry_after}s")
except APIConnectionError:
    print("Network error")
except ValidationError as e:
    print(f"Validation error: {e.message}")

Retry on rate limit (example)

The SDK already retries with backoff. If you want custom logic on top (e.g. wait for the server-suggested delay and retry once), wrap the call yourself:

retry.py (python)
import time
from octamem import OctaMem, RateLimitError

client = OctaMem(api_key="your-api-key")

def add_with_retry(content: str, previous_context: str = "", max_attempts: int = 2):
    for attempt in range(max_attempts):
        try:
            return client.add(content=content, previous_context=previous_context)
        except RateLimitError as e:
            if attempt == max_attempts - 1:
                raise
            wait = e.retry_after or 60
            time.sleep(wait)

add_with_retry("Important note", previous_context="")

Exception hierarchy

exceptions (text)
OctaMemError (base)
├── AuthenticationError       # 401
├── MethodNotSupportedError   # 403
├── NotFoundError             # 404
├── ValidationError           # 400
├── RateLimitError            # 429 (has .retry_after)
├── APIError                  # Other API errors
├── APIConnectionError        # Network
├── APITimeoutError           # Timeout
└── InternalServerError       # 5xx
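The status codes in the hierarchy above can be read as a lookup from HTTP status to exception name. The sketch below restates that table with plain strings so it runs without the SDK; the function name is hypothetical, and how the SDK performs this mapping internally is not documented here.

```python
def exception_name_for_status(status: int) -> str:
    """Hypothetical lookup mirroring the status codes listed in the hierarchy."""
    named = {
        400: "ValidationError",
        401: "AuthenticationError",
        403: "MethodNotSupportedError",
        404: "NotFoundError",
        429: "RateLimitError",
    }
    if status in named:
        return named[status]
    if 500 <= status <= 599:
        return "InternalServerError"
    # Anything else falls under the catch-all API error.
    # APIConnectionError and APITimeoutError have no status code: no response arrived.
    return "APIError"
```

Since every class derives from OctaMemError, a final except OctaMemError clause catches whatever the more specific handlers miss.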

Context managers

Both clients support context managers so connections are closed cleanly.

ctx_sync.py (python)
from octamem import OctaMem

with OctaMem(api_key="your-api-key") as client:
    results = client.get(
        query="What did the customer ask for?",
        previous_context="Support chat session #8823",
    )
    details = client.details()
    print(details)

ctx_async.py (python)
import asyncio
from octamem import AsyncOctaMem

async def main():
    async with AsyncOctaMem(api_key="your-api-key") as client:
        results = await client.get(
            query="What were the action items?",
            previous_context="Weekly standup March 15",
        )
        await client.add(
            content="Alice: finish API docs. Bob: review auth flow.",
            previous_context="Weekly standup March 15",
        )
        details = await client.details()
        print(details)

asyncio.run(main())

Type safety

The SDK is fully typed (PEP 561, py.typed). Exported types:

  • Params: GetParams (query, previousContext), AddParams (content, previousContext)
  • Responses: SearchResult, AddResult, MemoryDetails

In params dicts the key is previousContext (camelCase) to match the API.

types.py (python)
from octamem import (
    OctaMem,
    GetParams,
    AddParams,
    MemoryDetails,
    SearchResult,
    AddResult,
)

client = OctaMem(api_key="your-api-key")

params: GetParams = {
    "query": "What did we decide about pricing?",
    "previousContext": "Sales strategy call",
}
results: SearchResult = client.get(params=params)

add_params: AddParams = {
    "content": "Free tier: 10k memories. Pro: unlimited.",
    "previousContext": "Sales strategy call",
}
out: AddResult = client.add(params=add_params)

details: MemoryDetails = client.details()
print(details)

Optional: previous_context

Both get and add accept an optional previous_context argument (sent as previousContext in the request body). Use it to scope or link memories to a conversation or topic.

previous_context.py (python)
from octamem import OctaMem

client = OctaMem(api_key="your-api-key")

results = client.get(
    query="What did the user prefer for notifications?",
    previous_context="Onboarding conversation",
)
add = client.add(
    content="User selected email digest weekly and Slack for alerts.",
    previous_context="Onboarding conversation",
)
details = client.details()
print(details)

Requirements

  • Python 3.9+
  • httpx
  • pydantic

Why OctaMem?

  • Memory as a service — No need to run your own vector DB or manage embeddings; focus on your product.
  • Natural-language search — Query with questions or phrases instead of building search pipelines.
  • Built for AI — Designed for agents and chatbots that need persistent, searchable memory across sessions.
  • Simple API — Three main operations: search, add, details. Sync and async, with retries and type safety out of the box.