Persistent memory for AI agents solves a real problem (the goldfish-with-a-PhD problem), but it introduces a new one: a high-trust, cross-session, cross-agent data store sitting inside the LLM’s context loop. Every recall is content that flows into a prompt. Every store is content that came from somewhere — sometimes the user, sometimes the model, sometimes a tool result that originated externally.

That’s a threat model worth writing down before the data store grows up. This post is the threat model for memstore — the persistent memory system I built for Claude Code — and the controls I’m applying or planning.

Scope

This is a single-user, single-host threat model. Memstore today is deployed as one daemon serving one person’s namespaces, and every threat below assumes that boundary.

Multi-user and multi-org are deferred, but planned. Some of the infrastructure is already in place — memstored has had a Postgres-backed central store and a network daemon for a while, and the recently added API-key authentication is the first real client-identity primitive — but the threat model isn’t there yet, and neither is the architecture. Memstore will be multi-user, but not until that case has been thought through end to end.

The broad design intent is for individuals or organizations to run their own memstored instance rather than cross-org hosting on a shared service. Tenant separation lives at the deployment boundary, which avoids the worst of the shared-host attack surface (operator/insider risk, cross-tenant resource exhaustion, compliance-boundary attestation across organizations). Where a single deployment does host multiple organizations, the planned simplification is a separate Postgres namespace per org — keeping tenant boundaries at the storage layer rather than rebuilding them in daemon code. Threats specific to the multi-user case — cross-tenant isolation under one daemon, audit logging across namespaces, embedding-model isolation, key rotation — get their own threat-model post when the architecture catches up.
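To make that concrete, here is a minimal sketch of the planned provisioning, assuming one Postgres schema per organization. Nothing here is shipped, and the names are placeholders; the point is that the tenant boundary is a storage object, not daemon logic.

    package memory

    import (
    	"database/sql"
    	"fmt"
    	"regexp"

    	_ "github.com/lib/pq" // Postgres driver
    )

    // orgName keeps schema identifiers boring: lowercase, no quoting tricks,
    // nothing an attacker-controlled string can smuggle into DDL.
    var orgName = regexp.MustCompile(`^[a-z][a-z0-9_]{0,62}$`)

    // provisionOrg creates one schema per organization. Every table for the
    // tenant lives inside its schema, so isolation is enforced by storage
    // rather than by daemon code.
    func provisionOrg(db *sql.DB, org string) error {
    	if !orgName.MatchString(org) {
    		return fmt.Errorf("invalid org name %q", org)
    	}
    	// Identifiers can't be bound as parameters, hence the regexp gate above.
    	_, err := db.Exec(fmt.Sprintf("CREATE SCHEMA IF NOT EXISTS %s", org))
    	return err
    }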

Assets

  • User facts: identity, preferences, project context, credentials-adjacent metadata.
  • Project facts: design decisions, source-of-truth pointers, operational state.
  • Task state: in-progress work, deferred follow-ups.
  • Session transcripts: full or partial conversation history (in the warm/cool tiers).
  • Embeddings: vector representations of all of the above.

Trust Boundaries

  • Hooks → memstore: the hook is a shell script run by the harness. It receives untrusted content (file paths being read, prompts being submitted) and forwards subsets to memstore. The hook is privileged code reading untrusted data.
  • MCP client → memstore daemon: an LLM-driven MCP client makes recall and store calls. The client is operating on behalf of the user but is not the user. Its decisions are model-driven.
  • Memstore daemon → SQLite: the only fully trusted boundary. Even here, full-text search is an attacker-influenced query path.
  • Memstore daemon → Ollama (embedding): outbound dependency. The embedding model can be poisoned or replaced, or it can simply fail. Embeddings affect retrieval ordering but not stored content.

Threat Inventory

T1: Memory Poisoning via Untrusted Content

Attack: an attacker inserts a fact via a path that ultimately reaches a memory_store call. Most likely vector: a model-driven extraction pipeline summarizing a conversation that included adversarial content from a tool result.

Impact: the poisoned fact surfaces in future sessions as if it were authoritative. Future model behavior is influenced by attacker-controlled context.

Mitigations:

  • Provenance metadata (planned): record each fact’s source (user-explicit, model-extracted, hook-injected, tool-result) so downstream ranking and gating can use it; a sketch of the planned shape follows this list.
  • Provenance-aware ranking (planned, depends on the above): once provenance is captured, model-extracted facts should decay faster than user-confirmed ones. Current decay is category-based (note = 30-day half-life; preference and identity don’t decay), which is the closest proxy until provenance lands.
  • Confirmation signal (partial): the memory_confirm tool tracks confirmation count and last-confirmed timestamp, providing a trust signal recall can weight by. It does not currently gate promotion of high-impact facts; gating is planned.
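Here’s a rough sketch of what provenance-aware decay could look like once provenance metadata lands. The Go types and the 14-day half-life are illustrative assumptions, not memstore’s actual schema; today only the category-based decay exists.

    package memory

    import (
    	"math"
    	"time"
    )

    // Provenance records where a fact came from; the four values mirror the
    // sources listed above.
    type Provenance string

    const (
    	UserExplicit   Provenance = "user-explicit"
    	ModelExtracted Provenance = "model-extracted"
    	HookInjected   Provenance = "hook-injected"
    	ToolResult     Provenance = "tool-result"
    )

    type Fact struct {
    	Body       string
    	Provenance Provenance
    	StoredAt   time.Time
    }

    // halfLife: model-extracted and tool-derived facts decay faster than
    // user-confirmed ones. The 14-day figure is an illustrative guess.
    func halfLife(p Provenance) time.Duration {
    	switch p {
    	case UserExplicit:
    		return 0 // no decay, like identity/preference today
    	case ModelExtracted, ToolResult:
    		return 14 * 24 * time.Hour
    	default:
    		return 30 * 24 * time.Hour // today's note half-life
    	}
    }

    // decayWeight is plain exponential decay: 0.5^(age/halfLife).
    func decayWeight(f Fact, now time.Time) float64 {
    	hl := halfLife(f.Provenance)
    	if hl == 0 {
    		return 1.0
    	}
    	return math.Pow(0.5, float64(now.Sub(f.StoredAt))/float64(hl))
    }

Recall ranking would then multiply similarity by decayWeight, which is roughly what the category-based decay does today, minus the provenance dimension.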

T2: Cross-Project Leakage

Attack: facts stored under one project surface in another project’s context. May be benign drift; may be intentional exfiltration if a hostile prompt induces the model to query across projects.

Impact: data crosses an organizational or confidentiality boundary the user did not intend.

Mitigations:

  • Namespace isolation enforced at the daemon (shipped): all reads and writes are scoped to the caller’s namespace via SQL filters in search.go and sqlite.go. The client cannot override this; a minimal sketch of the scoping follows the list.
  • Cross-namespace queries gated by explicit flag (shipped): SearchOpts.AllNamespaces and SearchOpts.Namespaces must be set explicitly to widen scope. A persistent audit log of cross-namespace queries is planned but not implemented.
  • Default recall scoping (shipped): without an explicit override, recall returns only the caller’s namespace.
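The shape of that scoping, as a minimal sketch. The table and column names are assumptions; the real filters live in search.go and sqlite.go.

    package memory

    import "database/sql"

    // SearchOpts mirrors the flags described above. Widening scope is always
    // explicit, never inferred.
    type SearchOpts struct {
    	AllNamespaces bool
    	Namespaces    []string
    }

    // searchFacts binds the caller's namespace server-side; the client
    // supplies query text only and cannot substitute a namespace.
    func searchFacts(db *sql.DB, callerNS, ftsQuery string, opts SearchOpts) (*sql.Rows, error) {
    	if opts.AllNamespaces {
    		// The explicit wide-scope path (and the one worth audit-logging).
    		return db.Query(
    			`SELECT rowid, namespace, body FROM facts_fts WHERE facts_fts MATCH ?`,
    			ftsQuery)
    	}
    	// Default path: the caller's own namespace, enforced in SQL.
    	// (opts.Namespaces would expand to a placeholder-built IN clause.)
    	return db.Query(
    		`SELECT rowid, namespace, body FROM facts_fts
    		 WHERE facts_fts MATCH ? AND namespace = ?`,
    		ftsQuery, callerNS)
    }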

T3: Recall as a Side Channel

Attack: a model can be induced to perform recall queries whose results expose the existence or content of facts to the conversation, where adversarial content downstream can capture them.

Impact: facts the user expected to stay in memory leak into transcripts.

Mitigations:

  • Recall results that include sensitive provenance are flagged, and the model is instructed not to echo them verbatim.
  • Audit log of recalls per session, to surface anomalous query volume.
  • Considering: a “private” classification whose recalls return a hint that a fact exists, not its content. A sketch of that redaction follows this list.
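For the private classification, the redaction might look like this. Everything below is hypothetical; the feature is under consideration, not shipped.

    package memory

    // RecallResult is a hypothetical shape for what recall hands back to the
    // model.
    type RecallResult struct {
    	ID      int64
    	Body    string
    	Private bool
    }

    // redactPrivate strips the content of private facts before they can
    // enter a transcript, leaving only a hint that something exists.
    func redactPrivate(r RecallResult) RecallResult {
    	if !r.Private {
    		return r
    	}
    	r.Body = "[private fact withheld; ask the user to recall it explicitly]"
    	return r
    }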

T4: Embedding Model Compromise

Attack: the local embedding model (Ollama) is replaced or poisoned to systematically misrank certain facts (suppress, surface, or cluster maliciously).

Impact: retrieval bias that’s hard to detect without ground-truth comparison.

Mitigations:

  • Embedding model lock on first use: the model identity is recorded with each embedding (see the sketch after this list).
  • Re-embedding triggers on model change, with diff reporting.
  • Hybrid search (FTS5 + vector) means a poisoned embedding alone can’t fully suppress a fact — full-text retrieval still hits.
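A sketch of the model lock, assuming the runtime reports a model name and a blob digest. The field names are assumptions about memstore’s internals.

    package memory

    import "fmt"

    // Embedding stores the vector together with the identity of the model
    // that produced it.
    type Embedding struct {
    	Vector []float32
    	Model  string // e.g. the Ollama model name
    	Digest string // model blob digest, if the runtime exposes one
    }

    // checkModel refuses to rank with mixed-model vectors: a change triggers
    // re-embedding (with diff reporting) instead of silent misranking.
    func checkModel(stored Embedding, model, digest string) error {
    	if stored.Model != model || stored.Digest != digest {
    		return fmt.Errorf("embedding model changed (%s -> %s): re-embed before ranking",
    			stored.Model, model)
    	}
    	return nil
    }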

T5: Supersession Chain Tampering

Attack: an attacker stores a “superseding” fact that overrides a true historical fact, effectively rewriting memstore’s view of history.

Impact: the model acts on a false current state; the true historical fact is still in the chain but never surfaced.

Mitigations:

  • Supersession requires referencing the prior fact ID; random new facts can’t silently override anything.
  • High-impact subjects (matthew, identity, credentials-adjacent) flag any supersession for confirmation. A sketch of both checks follows this list.
  • History queries traverse the full chain on request, so a superseded fact remains reachable even though it is never surfaced by default.
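Both checks, sketched. The subject keys mirror this post; the function shape and names are illustrative.

    package memory

    import (
    	"errors"
    	"fmt"
    )

    // highImpact lists subjects whose supersessions require confirmation.
    var highImpact = map[string]bool{
    	"matthew":  true,
    	"identity": true,
    	// credentials-adjacent subjects would be matched here too
    }

    type Supersede struct {
    	PriorID int64 // required: a new fact with no prior ID overrides nothing
    	Subject string
    	Body    string
    }

    func validateSupersede(s Supersede, priorExists func(int64) bool) error {
    	if s.PriorID == 0 || !priorExists(s.PriorID) {
    		return errors.New("supersession must reference an existing prior fact ID")
    	}
    	if highImpact[s.Subject] {
    		return fmt.Errorf("supersession of %q held for user confirmation", s.Subject)
    	}
    	return nil
    }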

T6: SQL Injection via FTS5 Queries

Attack: model-supplied FTS5 query strings contain syntax that breaks out of the search context.

Impact: data exfiltration or modification.

Mitigations:

  • Parameterized queries throughout; no string concatenation into SQL.
  • FTS5 query syntax sanitization at the API boundary; one implementation shape is sketched after this list.
  • Defense-in-depth: SQLite read/write user separation in daemon mode (planned).
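One common shape for that sanitization: the untrusted string is bound as a parameter to MATCH, so it never touches SQL, and every token is wrapped in FTS5 phrase quotes so operators like OR, NEAR, NOT, column filters, and prefix stars read as literal text. This mirrors the control described above, not necessarily memstore’s exact code.

    package memory

    import "strings"

    // sanitizeFTS5 turns arbitrary input into a safe FTS5 query: every token
    // becomes a quoted phrase (embedded quotes doubled, per FTS5 string
    // syntax). Adjacent phrases combine as an implicit AND.
    func sanitizeFTS5(input string) string {
    	tokens := strings.Fields(input)
    	quoted := make([]string, 0, len(tokens))
    	for _, t := range tokens {
    		quoted = append(quoted, `"`+strings.ReplaceAll(t, `"`, `""`)+`"`)
    	}
    	return strings.Join(quoted, " ")
    }

The sanitized string then flows through the same parameterized path as everything else, as the bound argument to facts_fts MATCH ?.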

T7: Hook Argument Injection

Attack: file paths or prompt content reaching hooks contain shell metacharacters.

Impact: arbitrary command execution as the hook user.

Mitigations:

  • Hook scripts use exec-style invocation, never shell-string concatenation.
  • All untrusted arguments are passed via stdin or argv, never via shell expansion; the sketch after this list shows the pattern.
  • Hooks are version-locked via memstore setup.
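The hooks themselves are shell scripts, but the rule translates directly; here it is in Go for illustration. The memstore subcommand and --file flag are hypothetical.

    package memory

    import (
    	"os/exec"
    	"strings"
    )

    // storeFromHook forwards untrusted hook input to memstore without a
    // shell in the path: a string like `foo; rm -rf ~` is just bytes in
    // argv or on stdin, never shell syntax.
    func storeFromHook(untrustedPath, untrustedPrompt string) error {
    	cmd := exec.Command("memstore", "store", "--file", untrustedPath)
    	cmd.Stdin = strings.NewReader(untrustedPrompt)
    	return cmd.Run()
    }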

What I’m Deferring

  • Encryption at rest. Local-only deployment, full-disk-encrypted host. Acceptable for personal use; not for shared/multi-tenant deployment.
  • Per-fact ACLs. Namespace isolation is currently the only access control. Fine-grained ACLs are a separate feature.
  • Cryptographic provenance (signed facts). Plausible future work; not justified yet.

Closing

The threat model for an AI memory store is mostly the threat model for any database that takes attacker-influenced writes and serves them back into a privileged context. What’s specific to AI memory is that the privileged context is an LLM prompt, the writes are often model-generated, and the read path involves a vector model that can itself be compromised. The defenses look like database security plus content provenance plus a healthy fear of the loop closing on itself.


Related posts: Defense in Depth for AI Agents (the general framing this post applies to a specific system)

Related projects: memstore (the persistent memory system this threat model describes)