Proactive Context Injection with Claude Code Hooks

By Matthew Hunter | Mar 24, 2026 | claude-code, mcp, memstore

Claude Code sessions start with amnesia. Every conversation begins cold — no memory of what you worked on yesterday, what invariants matter in the file you’re about to edit, or what tasks are still pending from last week. CLAUDE.md helps by injecting static project context, and MCP tools let the model search for facts on demand. But both of those require something to go right first: either the static file happens to cover the relevant topic, or the model decides to search before acting. In practice, the model often doesn’t search. It plows ahead with what it has, and the most valuable context — the constraint you documented last Tuesday, the task you left half-finished — stays in the database unsurfaced. This post is about closing that gap with hooks: small scripts that fire at specific points in the Claude Code lifecycle and inject relevant context automatically, before the model even knows it needs it.

This is part of a series on memstore, a persistent memory system for AI agents. The flagship post covers the architecture and rationale. This one covers the delivery mechanism: how stored knowledge gets from the database into the conversation at the right moment.

The Problem: Context Is Expensive and Manual

LLM context windows are large but not free. Irrelevant context dilutes attention and burns tokens. The practical constraint isn’t “can I fit this” but “should I inject this” — every fact you add competes with the actual work for the model’s attention budget.

CLAUDE.md is what the Codified Context paper calls the hot tier: always loaded, always present. It’s the right place for project build commands, coding conventions, and the kind of standing instructions that apply to every session. But it can’t cover everything. A project with a hundred invariants, fifty design decisions, and a thousand symbol-level descriptions would need a CLAUDE.md the size of a novella. The model would spend half its context window reading instructions before doing any work.

The paper describes a three-tier model: hot context that’s always loaded, warm context triggered by specific files or tasks, and cold context searched on demand. CLAUDE.md handles hot. MCP tools handle cold. What was missing was warm — context that loads automatically based on what you’re doing, without anyone asking for it.

Claude Code’s hook system fills that gap.

Claude Code’s Hook System

Hooks are shell commands registered in .claude/settings.local.json that fire at specific points in a session’s lifecycle. Claude Code defines several events, including SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, and SessionEnd. Each hook receives a JSON payload on stdin describing the event (the prompt text, the tool name and arguments, the session metadata) and returns JSON on stdout. The key field in the response is additionalContext: any text you put there gets injected into the conversation as a system reminder, visible to the model but not to the user.
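
As a rough illustration of that contract, here is a minimal Python sketch of a hook body. It assumes the response shape Claude Code documents for context injection (a hookSpecificOutput object carrying hookEventName and additionalContext) and the hook_event_name and prompt fields of the input payload. The lookup_context helper is hypothetical and stands in for whatever memstore call a real hook would make.

```python
import json
import sys


def lookup_context(payload: dict) -> str:
    """Hypothetical stand-in for a memstore lookup; returns text to inject."""
    prompt = payload.get("prompt", "")
    return f"(memory) nothing stored yet for: {prompt[:40]}" if prompt else ""


def build_response(payload: dict) -> dict:
    """Turn a hook event payload into the JSON object written to stdout."""
    context = lookup_context(payload)
    if not context:
        return {}  # assumption: an empty object means "inject nothing"
    return {
        "hookSpecificOutput": {
            "hookEventName": payload.get("hook_event_name", ""),
            "additionalContext": context,
        }
    }


def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read one event from stdin and write one response to stdout."""
    payload = json.load(stdin)
    json.dump(build_response(payload), stdout)
```

A script built this way would call main() as its last line. Keeping the logic in build_response makes it easy to exercise without a live Claude Code session.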

The constraint that shapes everything is the default five-second timeout. Hooks must be fast. They can’t run a full RAG pipeline or call an external API with retry logic. They need to do one thing quickly and get out of the way. This pushes the design toward pre-computed results and local HTTP calls rather than complex processing at hook time.
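
One way to respect that constraint is to give every lookup its own time budget and fail open when it is exceeded. The sketch below is an assumption about how a hook could be written, not the memstore scripts themselves; run_within_budget and its three-second default are hypothetical.

```python
import subprocess


def run_within_budget(cmd: list[str], budget_seconds: float = 3.0) -> str:
    """Run a command and return its stdout, or "" if it fails or overruns."""
    try:
        done = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=budget_seconds,  # the child is killed when this expires
            check=True,              # non-zero exit raises CalledProcessError
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        return ""  # fail open: a slow or broken lookup must not stall the session
    return done.stdout.strip()
```

A SessionStart hook could then call run_within_budget(["memstore", "tasks", "--surface", "startup"]) and simply inject nothing when the call comes back empty.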

Five Hooks, Five Purposes

SessionStart: Load Pending Tasks and Drift Warnings

The first thing that happens when a session opens is a pair of CLI calls:

memstore tasks --surface startup
memstore check-drift

The task surfacing solves the “where was I?” problem. Memstore tracks tasks with status (pending, in-progress, completed, cancelled) and scope (user TODOs, agent follow-ups, collaborative work). The startup hook queries for anything pending or in-progress and injects it as context. If you left three tasks half-finished yesterday, the new session knows about them immediately. No manual search, no re-explaining.

The drift check is subtler. Facts in memstore often reference source files — “this function works like X” or “the config format is Y.” If those files change after the fact was stored, the fact might be stale. check-drift compares file modification times against fact timestamps for the last seven days and flags anything that looks outdated. The seven-day scope keeps the output manageable; without it, a large codebase would generate drift warnings for every file touched since the project started.
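
The comparison itself is small. The sketch below assumes the seven-day window applies to file modification times and that facts arrive as (fact_id, source_path, stored_at) tuples. Both are my guesses for illustration, not the actual check-drift internals.

```python
from datetime import datetime, timedelta


def find_drifted(facts, file_mtimes, now: datetime, window_days: int = 7):
    """Return IDs of facts whose source file changed after the fact was stored.

    facts: iterable of (fact_id, source_path, stored_at) tuples (hypothetical shape).
    file_mtimes: mapping of source_path -> last modification time.
    """
    cutoff = now - timedelta(days=window_days)
    drifted = []
    for fact_id, path, stored_at in facts:
        mtime = file_mtimes.get(path)
        if mtime is None or mtime < cutoff:
            continue  # unknown file, or last touched before the window
        if mtime > stored_at:
            drifted.append(fact_id)  # file is newer than the fact describing it
    return drifted
```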

UserPromptSubmit: Automatic Memory Recall on Every Prompt

This is the highest-traffic hook. Every prompt the user types triggers a single HTTP call to memstored’s /v1/recall endpoint, which typically completes in around 30ms.

The server-side processing is where the work happens. Rather than using the raw prompt as a search query, memstored runs IDF (inverse document frequency) keyword extraction to pick the most distinctive words. The word “function” appears in hundreds of facts and tells you nothing. The word “supersession” appears in a handful and strongly signals what the user cares about. The pipeline extracts the highest-IDF terms, runs per-keyword full-text search queries, and merges the results with a multi-keyword boost: facts matching several of your distinctive terms score higher than facts matching just one.
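
To make the pipeline concrete, here is a small in-memory sketch. It is not memstored’s code: the smoothed IDF formula, the keyword limit, and the 0.5 boost factor are assumptions, and set membership stands in for the per-keyword full-text search.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for the real tokenizer."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def idf_table(documents):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(tokenize(doc)))
    return {t: math.log((n + 1) / (df + 1)) + 1.0 for t, df in doc_freq.items()}


def top_keywords(prompt, idf, limit=3):
    """Pick the prompt's most distinctive terms that the corpus knows about."""
    known = {t for t in tokenize(prompt) if t in idf}
    return sorted(known, key=lambda t: (-idf[t], t))[:limit]


def rank(prompt, documents, limit=3, boost=0.5):
    """Score documents per keyword, boosting those that match several."""
    idf = idf_table(documents)
    keywords = top_keywords(prompt, idf, limit)
    scored = []
    for index, doc in enumerate(documents):
        terms = set(tokenize(doc))
        hits = [k for k in keywords if k in terms]
        if hits:
            base = sum(idf[k] for k in hits)
            scored.append((base * (1.0 + boost * (len(hits) - 1)), index))
    return [index for _, index in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

With a corpus where every fact mentions “function” but only two mention “supersession”, the common word ends up with the lowest IDF and drops out of the keyword list first.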

On top of keyword scoring, there’s a project boost for facts belonging to the current project, a file-context boost when the active file matches a fact’s source, and a feedback multiplier from the rating system — 1.0 + 0.3 * avg where avg ranges from -1.0 to +1.0. A fact that consistently gets positive feedback scores up to 1.3x; one the model finds unhelpful drops to 0.7x.
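
The feedback multiplier is simple enough to write out directly. In this sketch the clamping and the neutral default for unrated facts are my assumptions; only the 1.0 + 0.3 * avg formula comes from the scoring described above.

```python
def feedback_multiplier(ratings):
    """Map ratings in [-1.0, +1.0] to a score multiplier in [0.7, 1.3]."""
    if not ratings:
        return 1.0  # assumption: unrated facts are left unchanged
    avg = sum(ratings) / len(ratings)
    avg = max(-1.0, min(1.0, avg))  # guard against out-of-range input
    return 1.0 + 0.3 * avg
```

A fact rated +1, +1, and -1 averages to about 0.33, so its score is multiplied by roughly 1.1.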

Session-level deduplication prevents the same fact from being injected twice. Memstored tracks which fact IDs have been included in recall responses for the current session. On a fifty-prompt session about authentication, the foundational auth design decision gets injected on the first relevant prompt. Subsequent prompts get increasingly specific facts that the earlier injections didn’t cover. The context window stays clean.
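
The bookkeeping behind that is a set of delivered fact IDs per session. A minimal sketch, assuming in-memory state keyed by session ID (the SessionDedup class is hypothetical, not memstored’s actual structure):

```python
from collections import defaultdict


class SessionDedup:
    """Remember which fact IDs each session has already received."""

    def __init__(self):
        self._seen = defaultdict(set)  # session_id -> set of fact IDs

    def filter_new(self, session_id, ranked_fact_ids):
        """Return unseen IDs in ranked order and mark them as delivered."""
        seen = self._seen[session_id]
        fresh = []
        for fact_id in ranked_fact_ids:
            if fact_id not in seen:
                seen.add(fact_id)
                fresh.append(fact_id)
        return fresh
```

Because filtering happens after ranking, a later prompt that matches the same top fact falls through to the next-best candidates instead of repeating it.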

The hook skips slash commands and prompts under five words, neither of which is likely to benefit from memory recall. Output is budget-capped at 2000 characters by default, with individual facts truncated to 300 characters to keep the injection compact.
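
Both rules fit in a few lines. This is a sketch under assumptions: the function names are hypothetical, and counting one character per joining newline against the budget is my choice rather than documented behavior.

```python
def should_recall(prompt, min_words=5):
    """Skip slash commands and prompts too short to carry useful keywords."""
    text = prompt.strip()
    return not text.startswith("/") and len(text.split()) >= min_words


def pack_facts(facts, budget=2000, per_fact=300):
    """Truncate each fact, then add facts in ranked order until the budget is spent."""
    lines, used = [], 0
    for fact in facts:
        line = fact if len(fact) <= per_fact else fact[: per_fact - 3] + "..."
        cost = len(line) + 1  # +1 for the newline joining the lines
        if used + cost > budget:
            break
        lines.append(line)
        used += cost
    return "\n".join(lines)
```

With the defaults, six fully truncated facts fit and a seventh would push the output past the cap.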

PreToolUse:Read — Surface Facts Before You Read a File

When Claude is about to read a file, the hook fires two calls. First, memstore list-file retrieves any file-level or symbol-level facts associated with that path — learned descriptions of functions, types, and their relationships. Second, memstore eval-triggers checks the file path against trigger patterns and loads matching subsystem context.

The trigger system is where the highest-value context lives. A trigger fact encodes a mapping: “when touching files matching this pattern, inject these constraints.” For example, reading sqlite.go in the memstore codebase auto-loads the invariant “scanFact and factColumns must stay in sync” — a cross-cutting constraint that isn’t visible in any single file but matters every time you edit the schema handling code. More on triggers below.

PreToolUse:Edit — Surface Constraints Before You Change a File

The Edit hook does everything the Read hook does, plus symbol inference from the edit context. The hook’s script runs a regex against the old_string field in the Edit tool’s arguments, extracting Go function and type names (func, type) or Python definitions (def, class). If you’re editing a specific function, only that function’s facts get loaded — not everything known about the file.

This targeting matters. A file with thirty exported functions might have thirty symbol-level facts. Loading all of them when you’re editing one function wastes context. The symbol inference narrows the injection to just the facts about the code you’re actually changing.
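
The symbol inference can be approximated with a handful of regular expressions. These patterns are my guess at what such a script might use, not the ones in the memstore repository; they cover Go func and type declarations (including methods with receivers) and Python def and class statements.

```python
import re

# Go: "func Name(", "func (r *Recv) Name(", "type Name ..."
# Python: "def name(", "async def name(", "class Name"
SYMBOL_PATTERNS = [
    re.compile(r"^\s*func\s+(?:\([^)]*\)\s*)?([A-Za-z_]\w*)", re.MULTILINE),
    re.compile(r"^\s*type\s+([A-Za-z_]\w*)", re.MULTILINE),
    re.compile(r"^\s*(?:async\s+)?def\s+([A-Za-z_]\w*)", re.MULTILINE),
    re.compile(r"^\s*class\s+([A-Za-z_]\w*)", re.MULTILINE),
]


def infer_symbols(old_string):
    """Return the deduplicated symbol names declared in the text being replaced."""
    found = []
    for pattern in SYMBOL_PATTERNS:
        for name in pattern.findall(old_string):
            if name not in found:
                found.append(name)
    return found
```

An edit whose old_string contains no declaration yields an empty list, at which point a hook would presumably fall back to file-level facts.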

SessionEnd: Record Activity and Upload Transcript

When a session ends, three things happen. The hook stores a lightweight activity fact — “last active in project X at time Y” — so future sessions know when you were last working here. It prints open tasks to stderr as a departing reminder, visible in the terminal but not injected into the conversation.

The third piece is transcript upload. The hook sends the session transcript to memstored for storage and later analysis. This was originally in the MCP server’s shutdown path, which seemed like the natural place for cleanup work. The problem is that MCP server processes get killed without graceful shutdown — the parent process sends a signal and moves on. The transcript upload never completed. Moving it to the SessionEnd hook, which runs in its own process with a 10-second timeout, fixed the reliability issue. Small transcripts go inline in the HTTP request; large ones get uploaded via a detached background process that outlives the hook.
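
The inline-versus-detached split might look like the sketch below. The size cutoff is a made-up value, since the real threshold is not stated here, and both function names are hypothetical. The part that matters is start_new_session, which on POSIX systems puts the uploader in its own session so it is not torn down with the hook.

```python
import subprocess

INLINE_LIMIT = 256 * 1024  # hypothetical cutoff in bytes; the real threshold is not stated


def upload_mode(transcript_size, inline_limit=INLINE_LIMIT):
    """Decide whether a transcript rides in the request or goes to a background job."""
    return "inline" if transcript_size <= inline_limit else "detached"


def spawn_detached(cmd):
    """Start a process in its own session so it keeps running after the hook exits."""
    return subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX: detach from the hook's session
    )
```

The hook returns as soon as Popen does, so a large upload never counts against the hook’s own timeout.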

Triggers: The Warm Tier

Triggers are the mechanism that makes warm context work. A trigger is a fact with kind=trigger that encodes a file-pattern-to-context mapping: “when the agent touches a file matching **/sql*.go, load the SQLite subsystem’s invariants and conventions.”

The eval-triggers CLI evaluates glob patterns against file paths, supporting *, ?, and ** with suffix matching against absolute paths. When a pattern matches, it loads the associated subsystem’s facts, filtered by kind — invariants, conventions, and failure modes. These are the facts that represent cross-cutting constraints: things that aren’t visible in any single file but that you need to know before making changes.

The distinction from the Read/Edit hooks’ file-level facts is scope. File-level facts describe what’s in the file. Trigger-loaded facts describe what the file participates in — the broader system constraints that govern how it should be changed. Both are useful. They serve different purposes.
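
The glob evaluation behind eval-triggers can be sketched by translating a pattern into a regular expression. The semantics below are my reading of “suffix matching against absolute paths”: the pattern may match any trailing run of whole path segments, * and ? stay within one segment, and ** crosses directory boundaries. The real CLI may differ in edge cases.

```python
import re


def glob_to_regex(pattern):
    """Translate *, ?, and ** into a regex anchored at a path-segment boundary."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:.*/)?")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")        # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")     # anything within one segment
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")      # one character within a segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile(r"(?:^|/)" + "".join(out) + r"$")


def matches(pattern, absolute_path):
    """Suffix match: the pattern may match any trailing run of path segments."""
    return glob_to_regex(pattern).search(absolute_path) is not None
```

Under these rules **/sql*.go matches a sqlite.go at any depth but not mysql.go, because the match has to begin at a segment boundary.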

What Works and What Doesn’t

Works Well

Startup task surfacing genuinely solves the session-continuity problem. Before this, every new session required either a manual status update or hoping the model would search for pending work. Now it’s automatic. I open a session, and it already knows what’s in flight.

Server-side recall with IDF extraction produces noticeably better results than raw prompt search. The combination of distinctive keyword selection, multi-keyword boosting, and session dedup means the context that gets injected is relevant and non-repetitive. The operational lessons post covers the scoring details.

Trigger-based invariant loading is the single highest-value hook behavior. The invariants that triggers surface are exactly the kind of knowledge that prevents bugs — constraints that span multiple files and aren’t documented in any one place. Loading them automatically before an edit is worth more than any amount of post-hoc review.

Doesn’t Work Yet

Learned facts from local models are the weakest link. The codebase learner runs source through a local Ollama model, and the resulting symbol descriptions read like GoDoc summaries — technically accurate but low-signal. They describe what a function does, not why it matters or what constraints govern it. The operational lessons post describes how symbol demotion in the scoring pipeline mitigates this, but the underlying quality issue remains.

The Read hook on large codebases can inject a wall of symbol facts when you open a file with many exported functions. Without curation — marking some facts as more important than others — the hook doesn’t know which symbols matter for the current task. It loads everything and hopes the budget cap keeps it reasonable.

There’s no per-file or per-session suppression. If a hook is injecting context you don’t need for a particular task, you can’t tell it to stop for just this session. The options are global — disable the hook entirely or adjust the scoring — which is too coarse for “I know about the SQLite invariants, stop telling me.”

Setup

The hooks are registered in .claude/settings.local.json as an array of hook definitions. Each entry specifies the event type, an optional tool name matcher for PreToolUse/PostToolUse events, and the shell command to run. The command receives JSON on stdin and must return JSON on stdout with at minimum a result field and optionally additionalContext.

The memstore repository includes example hook scripts. The basic pattern is a shell script that parses the incoming JSON with jq, calls the appropriate memstore CLI command or curl to the memstored HTTP API, and formats the response as JSON with the context string in additionalContext.

What’s Next

Curator integration. The biggest open problem is fact quality at injection time. A small local model running as a curator could filter and rank facts before they’re injected — taking the candidate set from the recall pipeline and deciding which ones are actually relevant to the current task. This would address the Read hook’s wall-of-symbols problem and the generally low quality of auto-generated descriptions.

Trigger auto-generation during codebase ingestion. Right now triggers are manually authored. When the codebase learner walks a repository, it could analyze file dependencies and cross-references to propose trigger patterns automatically — “these five files share a database dependency, here’s a trigger that loads the schema constraints when any of them are touched.”

Write hook for new file creation. The current hooks cover reading and editing existing files but not creating new ones. A PreToolUse:Write hook could load project conventions and templates before the model generates a new file, ensuring it follows established patterns from the start.

Negative feedback signal. The rate_context feedback loop is closed for positive signal: helpful facts rise in the rankings. But there’s no way for the agent to flag an injected fact as actively unhelpful mid-session, so the recall pipeline can’t deprioritize it immediately and has to wait for the gradual scoring adjustment.

Project: github.com/matthewjhunter/memstore
