§ Interfaces
Python SDK. Memory as a Service for Python.
Add persistent, searchable memory to your AI agents, chatbots, and applications with a few lines of code. Sync and async clients, full type hints, and automatic retries.
Package: octamem (PyPI)
Version: 1.0.1
Requires: Python 3.9+
Quick start
Get from zero to a working call in under 30 seconds.
1. Install
pip install octamem
2. Run
from octamem import OctaMem
client = OctaMem(api_key="your-api-key-here")
# Search memories (previous_context scopes the query to a conversation or topic)
results = client.get(
    query="What did we decide about the launch date?",
    previous_context="Q1 product planning meeting",
)
print(results)
# Add a memory (previous_context links this to a conversation or topic)
client.add(
    content="Launch date set for April 15. Beta opens March 20.",
    previous_context="Q1 product planning meeting",
)
# Get usage and limits
details = client.details()
print(details)
Using an environment variable instead: set OCTAMEM_API_KEY and call OctaMem() with no arguments.
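A minimal sketch of the environment-variable pattern, assuming you want to fail early with a clear message when the key is missing. The helper name resolve_api_key is hypothetical and not part of the SDK; only the OCTAMEM_API_KEY variable name comes from the text above.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(explicit: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> str:
    """Return the explicit key if given, else OCTAMEM_API_KEY from env.

    Hypothetical helper for illustration; not part of the octamem package.
    """
    key = explicit or env.get("OCTAMEM_API_KEY")
    if not key:
        raise RuntimeError("Set OCTAMEM_API_KEY or pass api_key explicitly")
    return key


# Usage (shown as a comment so the sketch runs without the SDK installed):
# client = OctaMem(api_key=resolve_api_key())
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.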
Use cases
- AI agents & chatbots — Give your bot a persistent memory so it remembers context and past conversations.
- Personal knowledge bases — Store and search notes, decisions, and learnings by natural language.
- Multi-tenant apps — One API key per user or tenant; isolate memories per client.
- RAG and retrieval — Use OctaMem as the memory layer and query with natural language instead of managing embeddings yourself.
Installation
From PyPI (recommended).
pip install octamem
Then import octamem works from any Python file in that environment.
API at a glance
- get(...): takes query, previous_context="" (or params: GetParams). Returns SearchResult. Search memories by natural language. Sends {"query", "previousContext"} to /api/memory/search.
- add(...): takes content, previous_context="" (or params: AddParams). Returns AddResult. Store content in memory. Sends {"content", "previousContext"} to /api/memory/add.
- details(): takes no parameters. Returns MemoryDetails. Get usage and limits. Sends {} to /api/memory/details.
All methods exist on sync (OctaMem) and async (AsyncOctaMem); async uses await client.get(...) etc.
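The mapping from method to endpoint and request body can be sketched as below. The paths and body keys are the ones listed above; the build_request function and its argument names are illustrative assumptions, not SDK internals, and the sketch deliberately does not perform any HTTP call.

```python
from typing import Any, Dict, Tuple

# Endpoint paths as listed in "API at a glance".
ENDPOINTS = {
    "get": "/api/memory/search",
    "add": "/api/memory/add",
    "details": "/api/memory/details",
}


def build_request(method: str, text: str = "",
                  previous_context: str = "") -> Tuple[str, Dict[str, Any]]:
    """Return (path, JSON body) for a method name. Illustrative only."""
    if method == "get":
        body: Dict[str, Any] = {"query": text, "previousContext": previous_context}
    elif method == "add":
        body = {"content": text, "previousContext": previous_context}
    elif method == "details":
        body = {}
    else:
        raise ValueError(f"unknown method: {method}")
    return ENDPOINTS[method], body


path, body = build_request("get", "launch date?", "Q1 planning")
print(path, body)
```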
Parameter styles
Pass keyword arguments (recommended) or a params dict. The dict keys are query and previousContext for get, and content and previousContext for add.
# Keyword style (recommended), e.g. a chatbot recalling a support conversation
client.get(
    query="What did the customer say about their billing issue?",
    previous_context="Support conversation with customer #4521",
)
client.add(
    content="Customer requested a refund for the duplicate charge. Ticket escalated.",
    previous_context="Support conversation with customer #4521",
)
# Params dict
client.get(params={
    "query": "What are the user's notification preferences?",
    "previousContext": "User settings",
})
client.add(params={
    "content": "User prefers weekly digest and push for mentions only.",
    "previousContext": "User settings",
})
Response types (from the SDK)
The SDK returns Pydantic models. Every model allows extra fields from the API; call .model_dump() to get a plain dict.
- get(...) returns SearchResult. Fields: pending_id: str | None. Extra fields: allowed.
- add(...) returns AddResult. Fields: success: bool. Extra fields: allowed.
- details() returns MemoryDetails. Fields: usage: int | None, limit: int | None. Extra fields: allowed.
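Because the models allow extra fields, a dumped result may contain keys beyond the declared ones. A small sketch of separating the two, using a plain dict as a stand-in for the output of .model_dump(). The helper split_known_extra and the "matches" key are hypothetical; the actual extra fields depend on what the API returns.

```python
from typing import Any, Dict, Iterable, Tuple


def split_known_extra(dumped: Dict[str, Any],
                      known: Iterable[str]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a dumped model into declared fields and extra API fields."""
    known_set = set(known)
    declared = {k: v for k, v in dumped.items() if k in known_set}
    extra = {k: v for k, v in dumped.items() if k not in known_set}
    return declared, extra


# Stand-in for results.model_dump(); "matches" is a made-up extra field.
dumped = {"pending_id": None, "matches": ["Launch date set for April 15."]}
declared, extra = split_known_extra(dumped, ["pending_id"])
print(declared)
print(extra)
```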
Example usage
from octamem import OctaMem
client = OctaMem(api_key="your-api-key")
# Search: SearchResult (pending_id + any extra API fields)
results = client.get(
    query="What did we decide about the launch date?",
    previous_context="Q1 product planning meeting",
)
print(results)
print(results.pending_id)
print(results.model_dump())
# Add: AddResult (success + any extra API fields)
add = client.add(
    content="Launch date set for April 15. Beta opens March 20.",
    previous_context="Q1 product planning meeting",
)
print(add.success)
# Details: MemoryDetails (usage, limit)
details = client.details()
print(details)
print(details.usage, details.limit)
Sync vs async
- Sync (OctaMem): Use in scripts, CLI tools, or any blocking context. Simple client.get(), client.add(), client.details().
- Async (AsyncOctaMem): Use in async frameworks (FastAPI, asyncio, etc.). Same API with await.
Same capabilities; pick the style that matches your stack.
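One practical reason to pick the async style is that several requests can wait on the network at the same time. The sketch below illustrates this with a stand-in class; FakeAsyncClient is hypothetical and only mimics the shape of an awaitable get(), so it runs without the SDK or network access.

```python
import asyncio
from typing import List


class FakeAsyncClient:
    """Stand-in for an async client; NOT the real AsyncOctaMem."""

    async def get(self, query: str, previous_context: str = "") -> str:
        await asyncio.sleep(0.01)  # simulate network latency
        return f"result for: {query}"


async def fetch_all(client: FakeAsyncClient, queries: List[str]) -> List[str]:
    # gather schedules all coroutines together, so their waits overlap;
    # results come back in the same order as the inputs.
    return await asyncio.gather(*(client.get(query=q) for q in queries))


answers = asyncio.run(fetch_all(FakeAsyncClient(), ["launch date", "pricing"]))
print(answers)
```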
Async example
import asyncio
from octamem import AsyncOctaMem
async def main():
    client = AsyncOctaMem(api_key="your-api-key")
    results = await client.get(
        query="What are the user's notification preferences?",
        previous_context="User onboarding flow",
    )
    await client.add(
        content="User chose weekly digest emails and in-app notifications only.",
        previous_context="User onboarding flow",
    )
    details = await client.details()
    print(details)
    await client.close()
asyncio.run(main())
Or with a context manager:
async def main():
    async with AsyncOctaMem(api_key="your-api-key") as client:
        results = await client.get(
            query="What did we decide about the launch?",
            previous_context="Product roadmap sync",
        )
        await client.add(
            content="Ship MVP by end of Q2. Skip dark mode for v1.",
            previous_context="Product roadmap sync",
        )
        details = await client.details()
        print(details)
asyncio.run(main())
Configuration
Constructor options (same for OctaMem and AsyncOctaMem). All except api_key are keyword-only.
- api_key: default os.environ.get("OCTAMEM_API_KEY"), type str | None. API key. Prefer the environment variable for security.
- base_url: default https://platform.octamem.com, type str. API base URL.
- timeout: default 30.0, type float. Request timeout in seconds.
- max_retries: default 3, type int. Retries on connection errors, timeouts, rate limits, and 5xx responses.
- retry_delay: default 1.0, type float. Base delay between retries, in seconds.
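The options describe retry_delay as a base delay but do not state how the wait grows between attempts. As a rough illustration only, the sketch below assumes the delay doubles on each retry; the doubling factor and the function backoff_delays are assumptions, not documented SDK behavior.

```python
from typing import List


def backoff_delays(max_retries: int = 3, retry_delay: float = 1.0,
                   factor: float = 2.0) -> List[float]:
    """Delays in seconds before each retry, ASSUMING exponential growth.

    The real schedule used by the SDK may differ (e.g. jitter or a cap).
    """
    return [retry_delay * factor ** attempt for attempt in range(max_retries)]


print(backoff_delays())        # with the default options listed above
print(backoff_delays(2, 0.5))  # fewer retries, shorter base delay
```

Under this assumption, the worst-case added wait with the defaults is the sum of the list, which is useful when choosing a timeout for the calling code.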
# With explicit api_key (or omit to use OCTAMEM_API_KEY env var)
client = OctaMem(
    api_key="your-api-key",
    base_url="https://platform.octamem.com",
    timeout=30.0,
    max_retries=3,
    retry_delay=1.0,
)
Error handling and retries
The client retries on connection errors, timeouts, rate limits, and 5xx responses. You can still handle specific errors and implement custom retry logic.
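To make the retry rule concrete, the sketch below classifies an HTTP status code into the exception name from this section's exception hierarchy and says whether the rule above would treat it as retryable. The classify function is illustrative and returns names as strings; it is not how the SDK is implemented.

```python
from typing import Tuple

# Status-to-exception names as given in the exception hierarchy.
STATUS_TO_ERROR = {
    400: "ValidationError",
    401: "AuthenticationError",
    403: "MethodNotSupportedError",
    404: "NotFoundError",
    429: "RateLimitError",
}


def classify(status_code: int) -> Tuple[str, bool]:
    """Return (exception name, retryable?) for an HTTP status code."""
    if 500 <= status_code <= 599:
        return "InternalServerError", True  # 5xx responses are retried
    name = STATUS_TO_ERROR.get(status_code, "APIError")
    return name, status_code == 429  # rate limits are retried


for code in (401, 429, 503):
    print(code, classify(code))
```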
Catch specific errors
from octamem import (
    OctaMem,
    AuthenticationError,
    RateLimitError,
    APIConnectionError,
    ValidationError,
)
client = OctaMem(api_key="your-api-key")
try:
    client.add(content="New note", previous_context="")
except AuthenticationError:
    print("Invalid or missing API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after: {e.retry_after}s")
except APIConnectionError:
    print("Network error")
except ValidationError as e:
    print(f"Validation error: {e.message}")
Retry on rate limit (example)
The SDK already retries with backoff; if you want custom logic (e.g. log and retry once):
import time
from octamem import OctaMem, RateLimitError
client = OctaMem(api_key="your-api-key")
def add_with_retry(content: str, previous_context: str = "", max_attempts: int = 2):
    for attempt in range(max_attempts):
        try:
            return client.add(content=content, previous_context=previous_context)
        except RateLimitError as e:
            # Out of attempts: re-raise the last rate-limit error
            if attempt == max_attempts - 1:
                raise
            # Wait as long as the server asked, or 60 s if it gave no hint
            wait = e.retry_after or 60
            time.sleep(wait)
add_with_retry("Important note", previous_context="")
Exception hierarchy
OctaMemError (base)
├── AuthenticationError # 401
├── MethodNotSupportedError # 403
├── NotFoundError # 404
├── ValidationError # 400
├── RateLimitError # 429 (has .retry_after)
├── APIError # Other API errors
├── APIConnectionError # Network
├── APITimeoutError # Timeout
└── InternalServerError # 5xx
Context managers
Both clients support context managers so connections are closed cleanly.
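What "closed cleanly" means can be sketched with a stand-in class: the with block calls close() on exit, even when the body raises. FakeClient is hypothetical and only mirrors the general context-manager protocol; it is not the real OctaMem client.

```python
class FakeClient:
    """Stand-in with a close() method; NOT the real OctaMem client."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "FakeClient":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and on exceptions; returning None lets
        # any exception from the body propagate to the caller.
        self.close()


client = FakeClient()
try:
    with client:
        raise ValueError("simulated failure inside the block")
except ValueError:
    pass

print(client.closed)
```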
# Sync
with OctaMem(api_key="your-api-key") as client:
    results = client.get(
        query="What did the customer ask for?",
        previous_context="Support chat session #8823",
    )
    details = client.details()
    print(details)
# Async (run inside an async function)
async with AsyncOctaMem(api_key="your-api-key") as client:
    results = await client.get(
        query="What were the action items?",
        previous_context="Weekly standup March 15",
    )
    await client.add(
        content="Alice: finish API docs. Bob: review auth flow.",
        previous_context="Weekly standup March 15",
    )
    details = await client.details()
    print(details)
Type safety
The SDK is fully typed (PEP 561, py.typed). Exported parameter types: GetParams (query, previousContext) and AddParams (content, previousContext). Exported response types: SearchResult, AddResult, MemoryDetails. In params dicts the key is previousContext (camelCase) to match the API.
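The snake_case keyword argument and the camelCase dict key are easy to mix up. The sketch below shows one way to model the dict shapes and convert between the two styles. It assumes the params types behave like TypedDicts with the keys listed above; the names GetParamsSketch, AddParamsSketch, to_get_params, and to_add_params are hypothetical and defined here only so the example runs without the SDK.

```python
from typing import TypedDict


class GetParamsSketch(TypedDict):
    """Assumed shape of a get() params dict (camelCase key, as in the API)."""
    query: str
    previousContext: str


class AddParamsSketch(TypedDict):
    """Assumed shape of an add() params dict."""
    content: str
    previousContext: str


def to_get_params(query: str, previous_context: str = "") -> GetParamsSketch:
    # snake_case argument in, camelCase key out
    return {"query": query, "previousContext": previous_context}


def to_add_params(content: str, previous_context: str = "") -> AddParamsSketch:
    return {"content": content, "previousContext": previous_context}


print(to_get_params("What did we decide about pricing?", "Sales strategy call"))
```

A type checker will flag a misspelled key such as previous_context in a dict annotated with one of these types, which is the main benefit of typing the params dict.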
from octamem import (
    OctaMem,
    GetParams,
    AddParams,
    MemoryDetails,
    SearchResult,
    AddResult,
)
client = OctaMem(api_key="your-api-key")
params: GetParams = {
    "query": "What did we decide about pricing?",
    "previousContext": "Sales strategy call",
}
# Pass the dict via the params keyword, as in the params-dict style
results: SearchResult = client.get(params=params)
add_params: AddParams = {
    "content": "Free tier: 10k memories. Pro: unlimited.",
    "previousContext": "Sales strategy call",
}
out: AddResult = client.add(params=add_params)
details: MemoryDetails = client.details()
print(details)
Optional: previous_context
Both get and add accept an optional previous_context argument (sent as previousContext in the request body). Use it to scope or link memories to a conversation or topic.
from octamem import OctaMem
client = OctaMem(api_key="your-api-key")
results = client.get(
    query="What did the user prefer for notifications?",
    previous_context="Onboarding conversation",
)
add = client.add(
    content="User selected email digest weekly and Slack for alerts.",
    previous_context="Onboarding conversation",
)
details = client.details()
print(details)
Requirements
- Python 3.9+
- httpx
- pydantic
Why OctaMem?
- Memory as a service — No need to run your own vector DB or manage embeddings; focus on your product.
- Natural-language search — Query with questions or phrases instead of building search pipelines.
- Built for AI — Designed for agents and chatbots that need persistent, searchable memory across sessions.
- Simple API — Three main operations: search, add, details. Sync and async, with retries and type safety out of the box.