
A typed, governed, replayable memory layer.

Every agent request flows through OctaMem before it reaches a model. Retrieval is structured, capture is automatic, and every read is traceable. This is what we mean when we say memory as infrastructure.

Request ledger

Watch memory become infrastructure.

Each request moves through a controlled path before it becomes model context: identity, retrieval, policy, and render.

request.opened

The request arrives with previous_context, actor, and budget.

Nothing reaches the model until the memory layer knows who is asking, what they can see, and how much context it can spend.

/actor: agent:sales-copilot

/previous_context: sales-q1

/budget: 3,200 tokens

Records selected

  • fact

    HIPAA-safe defaults required for this account.

  • decision

    V2 proposal rejected because reporting exposed PHI.

  • rule

    Route contract deltas through legal above $10k.

Context packet

Preserve approved scope. Redact PHI. Cite rejected V2 reporting. Capture final launch decision after response.
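The ledger above reduces to a budget-aware selection step: filter records down to the actor's visible scopes, then pack them until the token budget is spent. A minimal sketch in plain Python; `Record` and `select_records` are hypothetical names for illustration, not the OctaMem API.

```python
from dataclasses import dataclass

@dataclass
class Record:
    type: str    # "fact" | "decision" | "rule"
    text: str
    scope: str   # e.g. "sales-q1"
    tokens: int  # approximate token cost of rendering this record

def select_records(records, actor_scopes, budget):
    """Drop records outside the actor's scopes, then pack under the token budget."""
    visible = [r for r in records if r.scope in actor_scopes]
    packet, spent = [], 0
    for r in visible:
        if spent + r.tokens <= budget:
            packet.append(r)
            spent += r.tokens
    return packet, spent

records = [
    Record("fact", "HIPAA-safe defaults required for this account.", "sales-q1", 12),
    Record("decision", "V2 proposal rejected because reporting exposed PHI.", "sales-q1", 14),
    Record("rule", "Route contract deltas through legal above $10k.", "legal-review", 11),
]
packet, spent = select_records(records, actor_scopes={"sales-q1"}, budget=3200)
# The legal-scoped rule never enters the packet: the actor cannot see it.
```

The scope check runs before the budget check, so out-of-scope records never compete for context space at all.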

OctaMem sits between agents and the models they call. Every request flows through OctaMem to be enriched with relevant memory before it reaches the LLM. Every response is captured back as typed memory, audited, and made available to every agent on the account.

The memory layer holds three independent stores. Semantic stores facts. Episodic stores events. Procedural stores behavior. Each is optimized for the shape of knowledge it preserves, and together they replace the workarounds nobody trusts in production: prompt stuffing, Redis caches, transcript replay, and vector recall over raw conversations.
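How the three stores split by record type can be sketched as a routing table. The store names come from the paragraph above; `route` and the rejection behavior are illustrative assumptions, not the product's dispatch logic.

```python
# Hypothetical routing table: each typed record lands in the store shaped for it.
STORE_FOR_TYPE = {
    "fact": "semantic",      # stable knowledge about the world
    "decision": "episodic",  # things that happened, anchored in time
    "rule": "procedural",    # how agents should behave
}

def route(record_type: str) -> str:
    """Return the target store, rejecting untyped records outright."""
    if record_type not in STORE_FOR_TYPE:
        raise ValueError(f"untyped record rejected: {record_type!r}")
    return STORE_FOR_TYPE[record_type]
```

`route("decision")` resolves to the episodic store; an unknown type raises instead of silently landing in an untyped blob.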

  • i.

    Recall.

    Typed retrieval with provenance. Every record carries a source, a timestamp, and a scope. Not a vector blob.

  • ii.

    Inject.

    Memory is rendered as structured context, not stuffed into the prompt. The model sees what matters in the right shape.

  • iii.

    Capture.

    Decisions become memory, with full lineage from the documents and conversations that produced them.

  • iv.

    Govern.

    Policy, audit, and access are enforced at the memory layer. Not the prompt, not the application code.
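Step ii, rendering memory as structured context rather than a flat prompt, might look like this in outline. The section layout and field names (`type`, `text`, `source`) are assumptions for the sketch, not the actual render format.

```python
def render_context(records):
    """Group records by type and emit labelled sections, not one flat string."""
    sections = {"fact": [], "decision": [], "rule": []}
    for r in records:
        sections[r["type"]].append(f"- {r['text']} (source: {r['source']})")
    return "\n".join(
        f"## {name}s\n" + "\n".join(lines)
        for name, lines in sections.items()
        if lines  # skip empty sections
    )

packet = render_context([
    {"type": "fact", "text": "HIPAA-safe defaults required.", "source": "crm:acct-81"},
    {"type": "rule", "text": "Route contract deltas through legal above $10k.", "source": "policy:legal-7"},
])
```

Because every line keeps its source reference, the rendered packet stays traceable even after it becomes model context.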

§ 03 Retrieval

Scoped, typed, replayable.

Recall is a typed query, not a similarity search. You ask for what you need by scope, type, and time, and you get records with sources you can trace back to the documents or conversations that produced them.

  • query: natural-language question or keywords
  • previous_context: scopes results to a thread, topic, or tenant
  • api_key: per-environment, per-team isolation
  • details(): live storage, plan limits, balance
agents/recall_q1.py
# Retrieve memory for a specific conversation or topic.
results = client.get(
    query="What did the customer reject in v2?",
    previous_context="Q1 product planning",
)

# Each item carries content, metadata, and provenance.
for item in results:
    print(item.id, item.created_at, item.content)
Each item carries id, content, created_at, and provenance.
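Provenance is what makes a record traceable back to the document or conversation that produced it. A sketch of walking such a chain, assuming a hypothetical `derived_from` pointer on each item:

```python
# Hypothetical provenance index: each item may point at the item it was derived from.
index = {
    "conv-17": {"id": "conv-17", "content": "Q1 planning transcript", "derived_from": None},
    "dec-4": {"id": "dec-4", "content": "V2 reporting rejected: exposed PHI", "derived_from": "conv-17"},
}

def trace(item, index):
    """Walk derived_from links from a record back to its originating source."""
    chain = [item]
    while item.get("derived_from"):
        item = index[item["derived_from"]]
        chain.append(item)
    return chain

chain = trace(index["dec-4"], index)
# chain runs from the decision back to the conversation that produced it
```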
§ 04 Policy & audit

Enforced at the storage layer. Not the prompt.

The model can’t leak what it never saw. Read scopes, write scopes, and redactions are applied inside OctaMem before a single token is rendered into context. The audit log is append-only and hash-chained: every entry references the previous one, so tampering is detectable.

  • Role-based scopes (department, project, agent)
  • Field-level redaction (PII, PHI, pricing, IP)
  • Immutable audit log with hash-chained entries
  • Retention policies with automatic expiry
policy/legal_scope.py
# Policy is enforced at the storage layer, not the prompt.
client.policy.set(
    previous_context="legal-review",
    rules=[
        # Only legal team agents can write under this label.
        Allow(role="legal-agent", action="write"),
        # Customer-services agents can read but not see PII.
        Allow(role="cs-agent", action="read", redact=["pii"]),
        # Everything else is denied by default.
        DenyAll(),
    ],
)
Same policy DSL applies to REST, MCP, Python, and JS clients.
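The hash-chained audit log described above is a standard construction and easy to verify independently: each entry's hash covers its payload plus the previous entry's hash, so editing any historical entry breaks every later link. A self-contained sketch of the idea, not OctaMem internals:

```python
import hashlib, json

def append_entry(log, payload):
    """Append an entry whose hash covers the payload and the previous entry's hash."""
    prev = log[-1]["hash"] if log else "0" * 64
    body = {"payload": payload, "prev": prev}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    log.append({**body, "hash": digest})

def verify(log):
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev = "0" * 64
    for entry in log:
        body = {"payload": entry["payload"], "prev": entry["prev"]}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != digest:
            return False
        prev = entry["hash"]
    return True

log = []
append_entry(log, {"actor": "cs-agent", "action": "read", "redact": ["pii"]})
append_entry(log, {"actor": "legal-agent", "action": "write"})
assert verify(log)

log[0]["payload"]["action"] = "write"  # tamper with history
assert not verify(log)                 # the broken chain exposes it
```

Canonical JSON (`sort_keys=True`) matters here: the same payload must always hash to the same digest, regardless of key order.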
§ 06 How it compares

Memory layer. Not a vector wrapper.

The shortest way to explain OctaMem is the things it’s deliberately not. Competitors solve retrieval accuracy. OctaMem solves retrieval accountability.

|                    | OCTAMEM                          | VECTOR DB           | CHAT MEMORY LIBS     | PROMPT STUFFING    |
|--------------------|----------------------------------|---------------------|----------------------|--------------------|
| Typed records      | Semantic / episodic / procedural | Untyped vectors     | Conversation buffers | Inline string      |
| Source provenance  | Every record source-linked       | Optional metadata   | Session-bound        | None               |
| Audit log          | Append-only, hash-chained        | Custom build        | Usually missing      | None               |
| Policy enforcement | Storage layer · redaction        | Application layer   | Application layer    | Prompt-engineered  |
| Model-agnostic     | Any LLM, any SDK                 | Yes                 | Often model-coupled  | Per-model          |
| Cost shape         | Linear in storage                | Linear + embed cost | Token-based          | Compounds per call |

fig. 4 · positioning matrix. Six dimensions that matter when an audit team asks.

§ 07 Deployment


Cloud.

Multi-tenant cloud with regional residency. Default for Team and Business plans.

us-east · eu-west · ap-southeast


VPC.

Single-tenant cloud in your own VPC. Customer-managed network and firewall.

AWS · GCP · Azure


On-prem.

Self-contained deployment for air-gapped or regulated environments.

Kubernetes · OpenShift · bare metal