Persistent memory for AI agents solves a real problem (the goldfish-with-a-PhD problem), but it introduces a new one: a high-trust, cross-session, cross-agent data store sitting inside the LLM’s context loop. Every recall is content that flows into a prompt. Every store is content that came from somewhere — sometimes the user, sometimes the model, sometimes a tool result that originated externally.

That’s a threat model worth writing down before the data store grows up. This post is the threat model for memstore — the persistent memory system I built for Claude Code — and the controls I’m applying or planning.

Scope

This is a single-user, single-host threat model. Memstore today is deployed as one daemon serving one person’s namespaces, and every threat below assumes that boundary.

Multi-user and multi-org are deferred, but planned. Some of the infrastructure is already in place — memstored has had a Postgres-backed central store and a network daemon for a while, and the recently added API-key authentication is the first real client-identity primitive — but the threat model isn’t there yet, and neither is the architecture. Memstore will be multi-user, but not until that case has been thought through end to end.

The broad design intent is for individuals or organizations to run their own memstored instance rather than cross-org hosting on a shared service. Tenant separation lives at the deployment boundary, which avoids the worst of the shared-host attack surface (operator/insider risk, cross-tenant resource exhaustion, compliance-boundary attestation across organizations). Where a single deployment does host multiple organizations, the planned simplification is a separate Postgres namespace per org — keeping tenant boundaries at the storage layer rather than rebuilding them in daemon code. Threats specific to the multi-user case — cross-tenant isolation under one daemon, audit logging across namespaces, embedding-model isolation, key rotation — get their own threat-model post when the architecture catches up.
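To make that concrete, here is a minimal sketch of the planned provisioning, assuming one Postgres schema per organization. Nothing here is shipped, and the names are placeholders; the point is that the tenant boundary is a storage object, not daemon logic.

    package memory

    import (
    	"database/sql"
    	"fmt"
    	"regexp"

    	_ "github.com/lib/pq" // Postgres driver
    )

    // orgName keeps schema identifiers boring: lowercase, no quoting tricks,
    // nothing an attacker-controlled string can smuggle into DDL.
    var orgName = regexp.MustCompile(`^[a-z][a-z0-9_]{0,62}$`)

    // provisionOrg creates one schema per organization. Every table for the
    // tenant lives inside its schema, so isolation is enforced by storage
    // rather than by daemon code.
    func provisionOrg(db *sql.DB, org string) error {
    	if !orgName.MatchString(org) {
    		return fmt.Errorf("invalid org name %q", org)
    	}
    	// Identifiers can't be bound as parameters, hence the regexp gate above.
    	_, err := db.Exec(fmt.Sprintf("CREATE SCHEMA IF NOT EXISTS %s", org))
    	return err
    }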

Assets

  • User facts: identity, preferences, project context, credentials-adjacent metadata.
  • Project facts: design decisions, source-of-truth pointers, operational state.
  • Task state: in-progress work, deferred follow-ups.
  • Session transcripts: full or partial conversation history (in the warm/cool tiers).
  • Embeddings: vector representations of all of the above.

Trust Boundaries

  • Hooks → memstore: the hook is a shell script run by the harness. It receives untrusted content (file paths being read, prompts being submitted) and forwards subsets to memstore. The hook is privileged code reading untrusted data.
  • MCP client → memstore daemon: an LLM-driven MCP client makes recall and store calls. The client is operating on behalf of the user but is not the user. Its decisions are model-driven.
  • Memstore daemon → SQLite: the only fully trusted boundary. Even here, full-text search is an attacker-influenced query path.
  • Memstore daemon → Ollama (embedding): outbound dependency. The embedding model can be poisoned or replaced, or it can simply fail. Embeddings affect retrieval ordering but not stored content.

Threat Inventory

T1: Memory Poisoning via Untrusted Content

Attack: an attacker inserts a fact via a path that ultimately reaches a memory_store call. Most likely vector: a model-driven extraction pipeline summarizing a conversation that included adversarial content from a tool result.

Impact: the poisoned fact surfaces in future sessions as if it were authoritative. Future model behavior is influenced by attacker-controlled context.

Mitigations:

  • Provenance metadata (planned): record each fact’s source (user-explicit, model-extracted, hook-injected, tool-result) so downstream ranking and gating can use it; a sketch of the planned shape follows this list.
  • Provenance-aware ranking (planned, depends on the above): once provenance is captured, model-extracted facts should decay faster than user-confirmed ones. Current decay is category-based (note = 30-day half-life; preference and identity don’t decay), which is the closest proxy until provenance lands.
  • Confirmation signal (partial): the memory_confirm tool tracks confirmation count and last-confirmed timestamp, providing a trust signal recall can weight by. It does not currently gate promotion of high-impact facts; gating is planned.
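Here’s a rough sketch of what provenance-aware decay could look like once provenance metadata lands. The Go types and the 14-day half-life are illustrative assumptions, not memstore’s actual schema; today only the category-based decay exists.

    package memory

    import (
    	"math"
    	"time"
    )

    // Provenance records where a fact came from; the four values mirror the
    // sources listed above.
    type Provenance string

    const (
    	UserExplicit   Provenance = "user-explicit"
    	ModelExtracted Provenance = "model-extracted"
    	HookInjected   Provenance = "hook-injected"
    	ToolResult     Provenance = "tool-result"
    )

    type Fact struct {
    	Body       string
    	Provenance Provenance
    	StoredAt   time.Time
    }

    // halfLife: model-extracted and tool-derived facts decay faster than
    // user-confirmed ones. The 14-day figure is an illustrative guess.
    func halfLife(p Provenance) time.Duration {
    	switch p {
    	case UserExplicit:
    		return 0 // no decay, like identity/preference today
    	case ModelExtracted, ToolResult:
    		return 14 * 24 * time.Hour
    	default:
    		return 30 * 24 * time.Hour // today's note half-life
    	}
    }

    // decayWeight is plain exponential decay: 0.5^(age/halfLife).
    func decayWeight(f Fact, now time.Time) float64 {
    	hl := halfLife(f.Provenance)
    	if hl == 0 {
    		return 1.0
    	}
    	return math.Pow(0.5, float64(now.Sub(f.StoredAt))/float64(hl))
    }

Recall ranking would then multiply similarity by decayWeight, which is roughly what the category-based decay does today, minus the provenance dimension.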

T2: Cross-Project Leakage

Attack: facts stored under one project surface in another project’s context. May be benign drift; may be intentional exfiltration if a hostile prompt induces the model to query across projects.

Impact: data crosses an organizational or confidentiality boundary the user did not intend.

Mitigations:

  • Namespace isolation enforced at the daemon (shipped): all reads and writes are scoped to the caller’s namespace via SQL filters in search.go and sqlite.go. The client cannot override this; a minimal sketch of the scoping follows the list.
  • Cross-namespace queries gated by explicit flag (shipped): SearchOpts.AllNamespaces and SearchOpts.Namespaces must be set explicitly to widen scope. A persistent audit log of cross-namespace queries is planned but not implemented.
  • Default recall scoping (shipped): without an explicit override, recall returns only the caller’s namespace.
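The shape of that scoping, as a minimal sketch. The table and column names are assumptions; the real filters live in search.go and sqlite.go.

    package memory

    import "database/sql"

    // SearchOpts mirrors the flags described above. Widening scope is always
    // explicit, never inferred.
    type SearchOpts struct {
    	AllNamespaces bool
    	Namespaces    []string
    }

    // searchFacts binds the caller's namespace server-side; the client
    // supplies query text only and cannot substitute a namespace.
    func searchFacts(db *sql.DB, callerNS, ftsQuery string, opts SearchOpts) (*sql.Rows, error) {
    	if opts.AllNamespaces {
    		// The explicit wide-scope path (and the one worth audit-logging).
    		return db.Query(
    			`SELECT rowid, namespace, body FROM facts_fts WHERE facts_fts MATCH ?`,
    			ftsQuery)
    	}
    	// Default path: the caller's own namespace, enforced in SQL.
    	// (opts.Namespaces would expand to a placeholder-built IN clause.)
    	return db.Query(
    		`SELECT rowid, namespace, body FROM facts_fts
    		 WHERE facts_fts MATCH ? AND namespace = ?`,
    		ftsQuery, callerNS)
    }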

T3: Recall as a Side Channel

Attack: a model can be induced to perform recall queries whose results expose the existence or content of facts to the conversation, where adversarial content downstream can capture them.

Impact: facts the user expected to stay in memory leak into transcripts.

Mitigations:

  • Recall results that include sensitive provenance are flagged, and the model is instructed not to echo them verbatim.
  • Audit log of recalls per session, to surface anomalous query volume.
  • Considering: a “private” classification whose recalls return a hint that a fact exists, not its content. A sketch of that redaction follows this list.
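For the private classification, the redaction might look like this. Everything below is hypothetical; the feature is under consideration, not shipped.

    package memory

    // RecallResult is a hypothetical shape for what recall hands back to the
    // model.
    type RecallResult struct {
    	ID      int64
    	Body    string
    	Private bool
    }

    // redactPrivate strips the content of private facts before they can
    // enter a transcript, leaving only a hint that something exists.
    func redactPrivate(r RecallResult) RecallResult {
    	if !r.Private {
    		return r
    	}
    	r.Body = "[private fact withheld; ask the user to recall it explicitly]"
    	return r
    }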

T4: Embedding Model Compromise

Attack: the local embedding model (Ollama) is replaced or poisoned to systematically misrank certain facts (suppress, surface, or cluster maliciously).

Impact: retrieval bias that’s hard to detect without ground-truth comparison.

Mitigations:

  • Embedding model lock on first use: the model identity is recorded with each embedding (see the sketch after this list).
  • Re-embedding triggers on model change, with diff reporting.
  • Hybrid search (FTS5 + vector) means a poisoned embedding alone can’t fully suppress a fact — full-text retrieval still hits.
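A sketch of the model lock, assuming the runtime reports a model name and a blob digest. The field names are assumptions about memstore’s internals.

    package memory

    import "fmt"

    // Embedding stores the vector together with the identity of the model
    // that produced it.
    type Embedding struct {
    	Vector []float32
    	Model  string // e.g. the Ollama model name
    	Digest string // model blob digest, if the runtime exposes one
    }

    // checkModel refuses to rank with mixed-model vectors: a change triggers
    // re-embedding (with diff reporting) instead of silent misranking.
    func checkModel(stored Embedding, model, digest string) error {
    	if stored.Model != model || stored.Digest != digest {
    		return fmt.Errorf("embedding model changed (%s -> %s): re-embed before ranking",
    			stored.Model, model)
    	}
    	return nil
    }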

T5: Supersession Chain Tampering

Attack: an attacker stores a “superseding” fact that overrides a true historical fact, effectively rewriting memstore’s view of history.

Impact: the model acts on a false current state; the true historical fact is still in the chain but never surfaced.

Mitigations:

  • Supersession requires referencing the prior fact ID; random new facts can’t silently override anything.
  • High-impact subjects (matthew, identity, credentials-adjacent) flag any supersession for confirmation. A sketch of both checks follows this list.
  • History queries traverse the full chain on request, so a superseded fact remains reachable even though it is never surfaced by default.
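Both checks, sketched. The subject keys mirror this post; the function shape and names are illustrative.

    package memory

    import (
    	"errors"
    	"fmt"
    )

    // highImpact lists subjects whose supersessions require confirmation.
    var highImpact = map[string]bool{
    	"matthew":  true,
    	"identity": true,
    	// credentials-adjacent subjects would be matched here too
    }

    type Supersede struct {
    	PriorID int64 // required: a new fact with no prior ID overrides nothing
    	Subject string
    	Body    string
    }

    func validateSupersede(s Supersede, priorExists func(int64) bool) error {
    	if s.PriorID == 0 || !priorExists(s.PriorID) {
    		return errors.New("supersession must reference an existing prior fact ID")
    	}
    	if highImpact[s.Subject] {
    		return fmt.Errorf("supersession of %q held for user confirmation", s.Subject)
    	}
    	return nil
    }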

T6: SQL Injection via FTS5 Queries

Attack: model-supplied FTS5 query strings contain syntax that breaks out of the search context.

Impact: data exfiltration or modification.

Mitigations:

  • Parameterized queries throughout; no string concatenation into SQL.
  • FTS5 query syntax sanitization at the API boundary; one implementation shape is sketched after this list.
  • Defense-in-depth: SQLite read/write user separation in daemon mode (planned).
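One common shape for that sanitization: the untrusted string is bound as a parameter to MATCH, so it never touches SQL, and every token is wrapped in FTS5 phrase quotes so operators like OR, NEAR, NOT, column filters, and prefix stars read as literal text. This mirrors the control described above, not necessarily memstore’s exact code.

    package memory

    import "strings"

    // sanitizeFTS5 turns arbitrary input into a safe FTS5 query: every token
    // becomes a quoted phrase (embedded quotes doubled, per FTS5 string
    // syntax). Adjacent phrases combine as an implicit AND.
    func sanitizeFTS5(input string) string {
    	tokens := strings.Fields(input)
    	quoted := make([]string, 0, len(tokens))
    	for _, t := range tokens {
    		quoted = append(quoted, `"`+strings.ReplaceAll(t, `"`, `""`)+`"`)
    	}
    	return strings.Join(quoted, " ")
    }

The sanitized string then flows through the same parameterized path as everything else, as the bound argument to facts_fts MATCH ?.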

T7: Hook Argument Injection

Attack: file paths or prompt content reaching hooks contain shell metacharacters.

Impact: arbitrary command execution as the hook user.

Mitigations:

  • Hook scripts use exec-style invocation, never shell-string concatenation.
  • All untrusted arguments are passed via stdin or argv, never via shell expansion; the sketch after this list shows the pattern.
  • Hooks are version-locked via memstore setup.
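The hooks themselves are shell scripts, but the rule translates directly; here it is in Go for illustration. The memstore subcommand and --file flag are hypothetical.

    package memory

    import (
    	"os/exec"
    	"strings"
    )

    // storeFromHook forwards untrusted hook input to memstore without a
    // shell in the path: a string like `foo; rm -rf ~` is just bytes in
    // argv or on stdin, never shell syntax.
    func storeFromHook(untrustedPath, untrustedPrompt string) error {
    	cmd := exec.Command("memstore", "store", "--file", untrustedPath)
    	cmd.Stdin = strings.NewReader(untrustedPrompt)
    	return cmd.Run()
    }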

What I’m Deferring

  • Encryption at rest. Local-only deployment, full-disk-encrypted host. Acceptable for personal use; not for shared/multi-tenant deployment.
  • Per-fact ACLs. Namespace isolation is currently the only access control. Fine-grained ACLs are a separate feature.
  • Cryptographic provenance (signed facts). Plausible future work; not justified yet.

Closing

The threat model for an AI memory store is mostly the threat model for any database that takes attacker-influenced writes and serves them back into a privileged context. What’s specific to AI memory is that the privileged context is an LLM prompt, the writes are often model-generated, and the read path involves a vector model that can itself be compromised. The defenses look like database security plus content provenance plus a healthy fear of the loop closing on itself.


Related posts: Defense in Depth for AI Agents (the general framing this post applies to a specific system)

Related projects: memstore (the persistent memory system this threat model describes)