An infrastructure for everything your agents shouldn’t forget.
Memory is
all you need.
The persistent memory layer for AI agents. Built for the stacks where forgetting isn’t an option.
No card. 2 GB on the free tier.
- OpenAI
- MCP
- Cursor
- LangGraph
- Vercel AI SDK
AI systems have no memory.
Abstract
№ 01
Every session starts from zero. Every agent starts blind. These failure modes compound in production. They look like edge cases until you measure them across a quarter.
- 01
Users repeat themselves every session.
A customer explained their setup last week. And the week before. The agent has no idea. Every session, a blank slate.
- 02
Agents forget the decisions they made.
An agent approves a workflow on Monday. By Wednesday it suggests the opposite. Without memory, there's no consistency, and no accountability.
- 03
Context windows overflow silently.
Critical details get pushed out as conversations grow. The model stops seeing what matters. Errors compound, undetected.
- 04
Multi-agent systems can’t share what they know.
Agent A learns something useful. Agent B has no access to it. Deploy five agents, get five isolated silos.
- 05
Every workaround is a hack.
Prompt stuffing. JSON blobs in Redis. Vector search over raw transcripts. They break at scale and nobody trusts them in production.
Memory has to be infrastructure — not a patch.
See how the architecture solves it
§ From failure to system
Memory in motion.
Every request passes through the same disciplined cycle. OctaMem doesn’t fire a generic search across one bucket of text. It rebuilds context from three memory types that each serve a distinct purpose, then reassembles them for the model.
Search cycle · in motion
Stage 01 / 05
- 01 / Caller
App or MCP
- 02 / Access · quota
Security layer
- 03 / OctaMem agent
Retrieval service
- 04 / Three layers
Memory layers
- 05 / Back to app
Unified context
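The five stages above can be sketched as a plain function pipeline. This is a minimal in-memory illustration under stated assumptions, not OctaMem's implementation: the names `MemoryLayers`, `check_access`, `retrieve`, `build_context`, and `handle_request`, the per-key quota counter, and the naive keyword matching are all hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryLayers:
    # Stage 04: three layers, each a plain list of strings in this sketch.
    semantic: list = field(default_factory=list)
    episodic: list = field(default_factory=list)
    procedural: list = field(default_factory=list)


def check_access(api_key, quotas):
    # Stage 02: reject unknown keys and callers that have used up their quota.
    remaining = quotas.get(api_key)
    if remaining is None:
        raise PermissionError("unknown API key")
    if remaining <= 0:
        raise PermissionError("quota exhausted")
    quotas[api_key] = remaining - 1


def retrieve(layers, query):
    # Stage 03: query each layer separately. Keyword matching is a
    # placeholder for real retrieval.
    tokens = [w.strip("?.,!") for w in query.lower().split()]
    tokens = [t for t in tokens if len(t) >= 4]

    def matches(text):
        lowered = text.lower()
        return any(t in lowered for t in tokens)

    return {
        "semantic": [m for m in layers.semantic if matches(m)],
        "episodic": [m for m in layers.episodic if matches(m)],
        "procedural": [m for m in layers.procedural if matches(m)],
    }


def build_context(hits):
    # Stage 05: reassemble the per-layer hits into one block for the model.
    lines = []
    for layer in ("semantic", "episodic", "procedural"):
        for item in hits[layer]:
            lines.append(f"[{layer}] {item}")
    return "\n".join(lines)


def handle_request(api_key, query, layers, quotas):
    # Stage 01: the caller (app or MCP client) enters here.
    check_access(api_key, quotas)
    return build_context(retrieve(layers, query))
```

Keeping the access check ahead of retrieval means a rejected caller never touches the memory layers, which mirrors the ordering of the stages listed above.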
Any file. Now memory.
Hand OctaMem the document itself. Contracts, decks, spreadsheets, emails, PDFs. We parse, structure, and store it as typed memory your agents can query forever.
Not embeddings of a blob. Clauses, parties, obligations.
- Batch upload
- 5 files
- Avg pages
- 40
- Max file
- 30 MB
- Retention
- Configurable
Drop the file. Memory does the rest.
Parsed memory record
contract-v3.pdf
Master Services Agreement, v3 · executed 2026-04-12
- parties: Acme Corp, OctaMem Inc.
- term: 24 months, auto-renew 12
- obligations: 99.9% uptime SLA, 30-day deletion
Searchable across the account under previous_context: legal-msas.
01 / category
Documents
Contracts, briefs, reports, knowledge bases.
Supported
- .docx
- .docm
- .dotx
- .dotm
- .odt
- .rtf
- .txt
02 / category
Spreadsheets
Tables, ledgers, datasets — reasoned over rows.
Supported
- .xlsx
- .xls
- .csv
03 / category
Presentations
Decks parsed slide by slide. Factual recall, not pixels.
Supported
- .pptx
04 / category
Email & Data
Threads, payloads, structured exports.
Supported
- .eml
- .json
§ From input to inheritance
Intelligence that compounds.
Every session without memory is a reset. Every session with memory is an upgrade.
- Day 1
Recognition.
Names, preferences, initial constraints. Conversations feel slightly personalized. The kind a thoughtful intern manages on day one.
- Day 30
Pattern awareness.
The agent remembers your decisions, avoids past mistakes, and follows your workflows without repeated instruction. Fewer questions, fewer corrections.
- Day 180
Operational depth.
Deep institutional context. The agent operates with continuity across teams, releases, and tools. A system of record your AI can actually use.
Day 360 isn’t on the chart. The curve keeps climbing.
+ Compounds with every session
One memory layer. Two paths.
Start on the general cloud, or run on a vertical-specific memory cloud tuned to your sector’s schemas, policies, and compliance posture.
One memory layer
- Semantic: facts & knowledge
- Episodic: events & history
- Procedural: workflows & rules
- 01 / FACTS & KNOWLEDGE
Semantic.
Stable knowledge, preferences, account facts, business rules, domain context. Relationship-aware retrieval over a graph store.
- Cycle
- Read-heavy
- 02 / EVENTS & HISTORY
Episodic.
What happened, when, and why. Time-based retrieval gives the model a sense of history, ordered and explainable.
- Cycle
- Append-only
- 03 / WORKFLOWS & RULES
Procedural.
How work should be done, escalation paths, compliance workflows, recurring routines. Retrieved by intent.
- Cycle
- Versioned
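To make the three cycles concrete (read-heavy, append-only, versioned), here is a minimal in-memory sketch. It is an illustration only: the class names `SemanticStore`, `EpisodicStore`, and `ProceduralStore` and their methods are hypothetical, and a dictionary of edges stands in for the graph store mentioned above.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class SemanticStore:
    # Read-heavy: facts kept as (subject, relation) -> object edges,
    # a tiny stand-in for a graph store.
    edges: dict = field(default_factory=dict)

    def put(self, subject, relation, obj):
        self.edges[(subject, relation)] = obj

    def related(self, subject):
        return {rel: obj for (subj, rel), obj in self.edges.items() if subj == subject}


@dataclass
class EpisodicStore:
    # Append-only: events are added, never edited in place.
    events: list = field(default_factory=list)

    def append(self, when, what):
        self.events.append((when, what))

    def between(self, start, end):
        # Time-based retrieval, returned in chronological order.
        return [what for when, what in sorted(self.events) if start <= when <= end]


@dataclass
class ProceduralStore:
    # Versioned: every revision of a workflow is kept; lookups by intent
    # return the latest one.
    versions: dict = field(default_factory=dict)

    def publish(self, intent, steps):
        self.versions.setdefault(intent, []).append(list(steps))
        return len(self.versions[intent])  # 1-based version number

    def latest(self, intent):
        return self.versions[intent][-1]
```

Separating the stores lets each one use the access pattern that suits it, which is the design point the three descriptions above make.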
Foundation
General Memory Cloud.
Sector-agnostic persistent memory for any agent workflow. Model-agnostic. Protocol-native.
- Cross-model memory under one account
- High-recall retrieval and context rebuild
- API and MCP-compatible access
- Semantic, episodic, and procedural memory types
- Granular deletion controls
Vertical clouds
Specialized Memory Clouds.
In progress
Domain-aware memory structures, sector-specific behavioural models, and compliance-aligned memory policies.
- Everything in General Memory Cloud
- Domain-specific memory schemas and retrieval
- Sector-aware behavioural continuity
- Policy-bound enforcement and guardrails
- Vertical-optimized context rebuild
- Priority support and deployment options
Healthcare, Legal, and Defense memory clouds in development.
Built for real systems.
The same memory layer, accessed however your team already builds. No bespoke vertical stack. No rewrite. The platform shapes to the workflow, not the other way around.
Coverage at a glance
- Verticals
08
Healthcare, finance, defense, public sector.
- Runtimes
08
REST, MCP, SDKs, IDE plugins.
- Memory layer
01
Unified across stacks.
- Stack rewrites
00
Drop in through existing interfaces.
Enterprise verticals
01
Healthcare
Patient continuity across visits. Treatment history that persists across care teams and sessions.
02
Legal
Case memory and precedent tracking. Client interaction continuity across matters.
03
Finance
Portfolio context, trade history, and risk awareness that compounds across agent sessions.
04
Insurance
Claims history and policy context. Adjuster memory that carries across every touchpoint.
05
Defense
Mission context that persists across briefings, operations, and multi-agent coordination.
06
Technology
Product context, customer success history, and engineering knowledge that carries across teams and releases.
07
Retail & logistics
Inventory, fulfillment, and partner memory across channels, warehouses, and agent-assisted operations.
08
And more…
Energy, media, public sector, and other high-stakes domains — we'll shape memory around your workflows.
Builder workflows
REST API
Direct HTTP endpoints. Full control without MCP or an SDK.
MCP Server
Remote MCP: use OctaMem from any MCP-compatible assistant or agent.
Claude Desktop
Connectors in Settings, or a config file with Node on Mac and Windows.
Cursor
Tools & MCP in Cursor settings, or mcp.json, on Mac, Windows, and Linux.
Claude.ai (Web)
Custom MCP connector URL in the web app.
OpenClaw
Plugin with auto-recall and capture for open agent stacks.
Python SDK
pip install octamem. Typed client for scripts and services.
JavaScript SDK
npm install @octamem/octamem-js. For Node, browsers, Deno, and Bun.
§ From market to stack
Built for the high-stakes stack.
When memory integrity matters, when decisions need traceability, when continuity is not optional. OctaMem is the layer your security, compliance, and infrastructure teams will actually approve.
- Policy
Policy-aware memory.
Agents respect organizational rules, constraints, and boundaries embedded in the memory layer. Not in the prompt, not in the model.
Role-based access · Scoped retrieval · Tenant isolation
- Audit
Audit-ready continuity.
Every memory write and read is traceable. Full lineage from source document to model output, with cryptographic integrity for regulated environments.
Immutable audit log · Source attribution · Retention policies
- Access
Role-based memory access.
Teams control who sees what. Memory isolation between departments, projects, and roles, enforced at the storage layer, not the application.
SSO · SCIM provisioning · Department scopes
- Deploy
Deploy where you must.
Cloud, private cloud, or on-premise. Single-tenant, dedicated keys, and customer-managed encryption available for the highest-stakes environments.
Cloud · VPC · On-prem · BYO-KMS
Same memory. Five runtimes.
The full integration. No vector DB to operate. No embedding pipeline to maintain. No chunking. OctaMem holds the memory; you keep your stack — Python, JavaScript, REST, or MCP.
- add(). Capture a memory with its previous context.
- get() / search(). Recall it from any agent, any session.
- MCP. Same operations as tool-calls in any MCP-compatible client.
from octamem import OctaMem

# Your API key from platform.octamem.com.
client = OctaMem(api_key="sk-om-live-...")

# Capture a memory.
client.add(
    content="Beta opens March 20.",
    previous_context="Q1 product launch",
)

# Recall it later, possibly from a different agent.
results = client.get(
    query="When does beta open?",
    previous_context="Q1 product launch",
)
print(results)

{
  "results": [
    { "id": "rec_01HV4Z…", "type": "semantic", "score": 0.94,
      "content": "Beta opens March 20.",
      "source": "planning_doc_q1", "created_at": "2026-02-14T09:12Z" },
    { "id": "rec_01HV7M…", "type": "episodic", "score": 0.88,
      "content": "Approved Q1 scope reduction on 2026-02-09." },
    { "id": "rec_01HV9F…", "type": "procedural", "score": 0.81 }
  ],
  "tokens": 642, "previous_context": "Q1 product launch"
}

Source-linked. Every record carries id, type, score, content, and source: auditable end-to-end, deletable by id or by previous_context.
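Once a response like the one above comes back, a caller typically turns it into prompt context. The sketch below assumes the response is available as a plain dictionary shaped like the sample output; whether the SDK returns a dict or a typed object is not stated here, and the helper `to_prompt_context` and its `min_score` threshold are hypothetical. The truncated ids are copied as shown.

```python
# Sample response, copied from the output shown above.
response = {
    "results": [
        {"id": "rec_01HV4Z…", "type": "semantic", "score": 0.94,
         "content": "Beta opens March 20.",
         "source": "planning_doc_q1", "created_at": "2026-02-14T09:12Z"},
        {"id": "rec_01HV7M…", "type": "episodic", "score": 0.88,
         "content": "Approved Q1 scope reduction on 2026-02-09."},
        {"id": "rec_01HV9F…", "type": "procedural", "score": 0.81},
    ],
    "tokens": 642,
    "previous_context": "Q1 product launch",
}


def to_prompt_context(response, min_score=0.85):
    # Keep records that have content and clear the score threshold,
    # and label each line with its memory type and source.
    lines = []
    for record in response["results"]:
        content = record.get("content")
        if content is None or record["score"] < min_score:
            continue
        source = record.get("source", "unknown source")
        lines.append(f"({record['type']}, {source}) {content}")
    return "\n".join(lines)


print(to_prompt_context(response))
```

Using `.get()` for `content` and `source` matters because, in the sample, not every record carries every field.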
p50 retrieve
84ms
Edge-cached search across all three memory layers, single region.
p99 retrieve
210ms
Worst-case end-to-end, cold cache, with reranking.
Write ack
32ms
Synchronous acknowledgement before async embedding & indexing.
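The write-ack pattern described above (acknowledge synchronously, then embed and index in the background) can be illustrated with a queue and a worker thread. This is a generic sketch of the pattern, not OctaMem's internals: the class `WriteAckStore`, its methods, and the set used in place of a real index are assumptions for the example.

```python
import queue
import threading
import uuid


class WriteAckStore:
    def __init__(self):
        self.records = {}        # durable write, visible as soon as add() returns
        self.indexed = set()     # ids the background worker has processed
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._index_loop, daemon=True)
        worker.start()

    def add(self, content):
        # Synchronous part: store the record and hand back an id right away.
        record_id = uuid.uuid4().hex
        self.records[record_id] = content
        self._pending.put(record_id)
        return record_id

    def _index_loop(self):
        # Asynchronous part: stand-in for embedding and indexing.
        while True:
            record_id = self._pending.get()
            self.indexed.add(record_id)
            self._pending.task_done()

    def wait_until_indexed(self):
        # Block until every queued record has been processed.
        self._pending.join()
```

The caller's latency is bounded by the cheap synchronous step, while the slower indexing work happens off the request path.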
§ From code to control
Your memory.
Your control.
Memory is sensitive. See what is stored, keep it structured and traceable, and delete it whenever you want. No opaque embeddings. No locked-in vendor format.
Audit chain
Every memory action leaves a mark.
Reads, writes, redactions, and policy checks are chained together so the record can be inspected after the fact.
Active event
context.delivered
hash: sha256:ad72f9019c
- EVT_4182 · prev: 0b91ce774a
recall.requested
agent:legal-copilot · acme/legal/msas
- EVT_4183 · prev: 8f4a2c91b0
policy.checked
policy:contract-scope · redact: pricing / pii
- EVT_4184 · prev: 1c68bd044e
context.delivered
octamem:renderer · 642 tokens / 7 sources
- EVT_4185 · prev: ad72f9019c
memory.captured
agent:legal-copilot · retention: 365 days
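A chained log like the one above can be sketched with SHA-256, where each event stores the hash of the event before it. This is a minimal illustration of hash chaining in general: the field names `kind`, `actor`, `prev`, and `hash`, and the canonical-JSON encoding, are assumptions and not OctaMem's actual event format.

```python
import hashlib
import json


def event_hash(body):
    # Hash a canonical JSON form so key order cannot change the digest.
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(chain, kind, actor):
    # Link the new event to the hash of the previous one (None for the first).
    prev = chain[-1]["hash"] if chain else None
    body = {"kind": kind, "actor": actor, "prev": prev}
    chain.append({**body, "hash": event_hash(body)})


def verify(chain):
    # Recompute every hash and check each link; any edit breaks the chain.
    prev = None
    for event in chain:
        body = {"kind": event["kind"], "actor": event["actor"], "prev": event["prev"]}
        if event["prev"] != prev or event["hash"] != event_hash(body):
            return False
        prev = event["hash"]
    return True
```

Because each hash covers the previous hash, changing any earlier event invalidates every event after it, which is what makes after-the-fact inspection meaningful.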
- i.
Visible.
See exactly what your agents remember. Every record, every source, every change. No black box.
- ii.
Structured.
Memory is typed, tagged, and traceable. Each record carries provenance. Not a blob of embeddings.
- iii.
Deletable.
Remove any memory at any time, by record or by previous_context. Forget on command.
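Deleting by record or by previous_context amounts to two lookups over the same records. The sketch below shows the idea with an in-memory index; the `Record` and `MemoryIndex` names and the method signatures are hypothetical and do not describe the SDK's delete calls.

```python
from dataclasses import dataclass


@dataclass
class Record:
    id: str
    content: str
    previous_context: str


class MemoryIndex:
    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[record.id] = record

    def delete_by_id(self, record_id):
        # Returns True only when a record was actually removed.
        return self._records.pop(record_id, None) is not None

    def delete_by_context(self, previous_context):
        # Collect ids first so the dict is not mutated while iterating.
        doomed = [r.id for r in self._records.values()
                  if r.previous_context == previous_context]
        for record_id in doomed:
            del self._records[record_id]
        return len(doomed)

    def ids(self):
        return sorted(self._records)
```

Returning a count or a boolean from each delete gives the caller something to log, which fits the traceability theme of this section.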
Compliance posture
4 frameworks
- In progress
SOC 2 Type II
Controls in place · bridge letter on request
- Available
HIPAA-Ready
BAA available on Enterprise
- Available
GDPR / DPA
DPA available · EU residency in eu-west
- In progress
ISO 27001
Controls in place · audit in progress
Infrastructure
Encryption
AES-256-GCM at rest · TLS 1.3 in transit · BYO-KMS on Enterprise
Access control
Role-based scopes · SSO via Okta, Entra, Google · SCIM
Audit logging
Immutable append-only log · per-record provenance
Retention
Configurable windows · scoped delete by record or context
Observability
Per-tenant metrics, latency, error budgets · Datadog export
Resilience
Multi-AZ · RTO 30m / RPO 5m
Deployment
01
Multi-tenant cloud
us-east, eu-west, ap-southeast.
02
VPC peering
Private network ingress on Enterprise.
03
Single-tenant
Dedicated cluster, dedicated keys.
04
On-prem / air-gapped
Container image, customer-managed stores.
The questions
we always get.
Six of the most common things buyers ask in the first conversation. If yours isn’t here, send us a note and we’ll add it.
A vector database is an index. OctaMem is a record. We hold typed memory with provenance, audit, and policy: the things a vector store leaves to you. You can keep your vector DB; pass the hits as sources when you write memory back. Vectors stay your index. OctaMem stays your record of truth.
Three things, in order of importance. (1) Memory survives across sessions, models, and tools, not just the current conversation window. (2) Memory is typed: semantic facts, episodic events, procedural rules, each retrievable on its own. (3) Memory is auditable: every write and read carries a source, a scope, and a traceable lineage your security team can inspect.
Every record has a stable id, type, content, source, and previous_context. The full export is plain JSON, available via the REST API, the SDKs, or as a snapshot file on Enterprise plans. No proprietary format. No lock-in.
Yes, on Enterprise. We ship a self-contained deployment with your team’s hardware specs. Customer-managed Postgres + Neo4j + Elasticsearch. Single-tenant cloud and customer VPC are also options. See Trust & security for the full deployment matrix.
Three plans for organizations: Team at $1,200/mo, Business at $4,800/mo, Enterprise on contract. Individual builders use the developer tiers from $0 to $14/mo. Free tier is real: 2 GB of memory, full SDK access, no card required. See pricing for the breakdown.
Any of them. OctaMem doesn’t run the model: you bring your own LLM keys, whether for OpenAI or a self-hosted model. We run the memory layer; you keep the model relationship. Switch model providers tomorrow and your memory still works.
A closing note
Stop resetting.
Start remembering.
Persistent memory infrastructure for every agent, every model, every workflow your organization runs. With the audit trail your security team requires and the simplicity your developers expect.
No card required · Free tier includes 2 GB memory