Matthew Hunter

The LiteLLM Supply Chain Attack: A Homelab Postmortem

By Matthew Hunter | May 15, 2026 | ai, security, supply-chain, litellm, homelab, postmortem

On March 24, 2026, the LiteLLM PyPI package was compromised. Versions 1.82.7 and 1.82.8, published by an account labeled TeamPCP, contained malicious code. I had LiteLLM running in my homelab as a routing layer between local AI clients and several model backends. This post is the postmortem: what I was running, what the exposure actually was, why I removed LiteLLM rather than just upgrading, and what the incident clarified about supply chain risk in homelab AI infrastructure.
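A first triage step for anyone in the same position is to check which LiteLLM version is actually installed. The sketch below is illustrative only: the two version strings come from this post, while the function names are hypothetical, and a clean result says nothing about whether a bad release was installed earlier and later replaced.

```python
from importlib import metadata
from typing import Optional

# Versions named in this post as compromised.
COMPROMISED_LITELLM_VERSIONS = {"1.82.7", "1.82.8"}


def is_compromised(version: str) -> bool:
    """Return True if the version string is one of the releases named above."""
    return version.strip() in COMPROMISED_LITELLM_VERSIONS


def installed_litellm_version() -> Optional[str]:
    """Return the installed litellm version, or None if it is not installed."""
    try:
        return metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_litellm_version()
    if version is None:
        print("litellm is not installed in this environment")
    elif is_compromised(version):
        print(f"litellm {version} is one of the compromised releases")
    else:
        print(f"litellm {version} is not one of the releases named in this post")
```

Run it inside each virtual environment or container that might have pulled the package, since the check only sees the environment it runs in.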

Defense in Depth for AI Agents

By Matthew Hunter | May 12, 2026 | ai, security, prompt-injection, mcp, architecture

The security conversation around AI agents has mostly focused on two things: keeping agents from hurting the host system, and keeping malicious tools out of the supply chain. These are real problems. Cisco documented how OpenClaw leaks credentials and executes arbitrary shell commands. Projects like NanoClaw respond by running agents in containers where bash commands can’t reach the host. Zencoder’s MCP survival guide catalogs supply chain attacks against MCP servers and recommends pinning git tags and auditing source.

Threat Modeling a Persistent Memory Store for AI Agents

By Matthew Hunter | May 12, 2026 | ai, security, memstore, mcp, threat-modeling

Persistent memory for AI agents solves a real problem (the goldfish-with-a-PhD problem), but it introduces a new one: a high-trust, cross-session, cross-agent data store sitting inside the LLM’s context loop. Every recall is content that flows into a prompt. Every store is content that came from somewhere: sometimes the user, sometimes the model, sometimes a tool result that originated externally.

That’s a threat model worth writing down before the data store grows up. This post is the threat model for memstore — the persistent memory system I built for Claude Code — and the controls I’m applying or planning.

Self-Improving Recall: A Feedback Loop for AI Memory

By Matthew Hunter | May 11, 2026 | memstore, claude-code, ranking

A memory system that ranks facts the same way forever is dead weight. The signal that actually matters — did this fact help, or did it waste context — only exists during a real conversation. Memstore’s feedback loop captures that signal in-session and feeds it back into recall ranking, so the system gets better at surfacing useful knowledge the more it’s used.

The problem: static recall is stale recall

Memstore’s baseline ranking uses static signals — IDF, project boosts, semantic similarity, recency, surface-aware multipliers for project-level facts. All of them are derived from the fact itself, the query, or the static metadata around them. None answers the real question: when memstore injected this fact last time, did it help the agent or did it just crowd out something better?
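One way to picture the idea is a multiplier applied on top of the static score. This is a minimal sketch under stated assumptions, not memstore's actual formula: the `FeedbackCounts` type, the `strength` and `prior` parameters, and the multiplicative form are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FeedbackCounts:
    helpful: int = 0    # times the fact was judged useful after injection
    unhelpful: int = 0  # times it was judged to have wasted context


def feedback_multiplier(counts: FeedbackCounts, strength: float = 0.5,
                        prior: float = 2.0) -> float:
    """Map feedback counts to a multiplier between 1 - strength and 1 + strength.

    The prior damps the effect when there are few observations,
    so a single vote cannot swing the ranking.
    """
    total = counts.helpful + counts.unhelpful
    net = (counts.helpful - counts.unhelpful) / (total + prior)
    return 1.0 + strength * net


def rerank(static_scores, feedback):
    """Sort fact ids by static score times feedback multiplier, highest first."""
    def adjusted(fact_id):
        counts = feedback.get(fact_id, FeedbackCounts())
        return static_scores[fact_id] * feedback_multiplier(counts)
    return sorted(static_scores, key=adjusted, reverse=True)
```

With no feedback recorded the multiplier is exactly 1.0, so the static ordering is preserved until real signal accumulates.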

math-mcp: Giving LLMs a Calculator That Knows It's a Calculator

By Matthew Hunter | May 10, 2026 | mcp, claude-code, golang, math, finance, statistics

LLMs do arithmetic in their heads. Mostly they’re close. Occasionally they’re off by enough to matter — a mortgage payment that’s $90 too high, an opportunity-cost claim that’s understated by 10%, a loan payoff term that ignores the interest still accruing while you pay it down. The model doesn’t know which case it’s in. Neither do you, unless you check.

math-mcp is a small MCP server that gives the model somewhere else to send those questions. It exposes ~55 discrete tools: Go’s math standard library, gonum’s statistical aggregates, and razorpay/go-financial’s time-value-of-money functions, each wrapped as its own tool so the LLM picks them out of its tool list by name and calls them directly. One tool per function, no expression evaluator. The point is to make “I should not estimate this” the easy path.
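The "one tool per function, no expression evaluator" shape can be sketched as a plain name-to-function table. math-mcp itself is written in Go; the Python below is only an illustration, and the tool names, the `call_tool` helper, and the `pmt` signature are assumptions rather than the server's real interface.

```python
import math
import statistics
from typing import Callable, Dict


def pmt(principal: float, annual_rate: float, months: int) -> float:
    """Fixed monthly payment on a fully amortizing loan."""
    if months <= 0:
        raise ValueError("months must be positive")
    r = annual_rate / 12.0  # periodic (monthly) rate
    if r == 0:
        return principal / months
    growth = (1.0 + r) ** months
    return principal * r * growth / (growth - 1.0)


# Each entry is one discrete, named tool. There is no free-form
# expression parser: the caller must pick a tool by name.
TOOLS: Dict[str, Callable[..., float]] = {
    "sqrt": math.sqrt,
    "mean": statistics.fmean,
    "pmt": pmt,
}


def call_tool(name: str, *args) -> float:
    """Dispatch to a registered tool, rejecting unknown names."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](*args)
```

For example, `call_tool("pmt", 300000, 0.06, 360)` computes the monthly payment on a 30-year, 6% loan of $300,000, which is the kind of figure the model should look up rather than estimate.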

HUMOR: 35%

By Matthew Hunter | Mar 27, 2026 | claude-code, humor

In Interstellar, TARS has a configurable humor setting. Cooper runs it at 75%. It’s played for laughs, but it’s also one of the more honest things in the film: personality as a dial, mood as a parameter.

I gave Claude Code the same feature.

My AI coding assistant runs with a UserPromptSubmit hook, a small script that fires on every prompt and injects context into the conversation. It was already injecting the current date and time. I decided to also inject a humor percentage, dynamically computed from the hour of day.

Operational Lessons from 1,500 Memstore Facts

By Matthew Hunter | Mar 26, 2026 | memstore, mcp, claude-code, information-retrieval

Building a persistent memory system for AI agents is one problem. Operating it at scale is a different one. After two months of daily use, memstore holds around 1,500 active facts across a dozen projects — project architecture, coding conventions, design decisions, cross-cutting invariants, and roughly a thousand symbol-level descriptions generated by running an LLM over every Go function in every codebase I work on. The system works. The recall pipeline surfaces relevant context on every prompt without manual search. But getting from “works” to “works well” required a series of scoring adjustments, noise filters, and feedback mechanisms that the original design didn’t anticipate.

Fact Supersession: Version Control for Knowledge

By Matthew Hunter | Mar 25, 2026 | memstore, knowledge-management

Most memory systems for AI agents treat knowledge as a key-value store. Write a fact, overwrite it later, old value is gone. That works for simple preferences — “use dark mode” doesn’t need a paper trail. But knowledge that evolves over time is a different problem. When a design decision turns out to be wrong, or a project’s architecture shifts, or a dependency gets replaced, you don’t just want the current answer. You want to know what you believed before, when it changed, and ideally why. Losing that history means losing the reasoning trail, and reasoning is the expensive part. Memstore’s supersession system brings version control semantics to AI memory: facts get replaced, not erased, and the full chain of revisions is preserved.
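The difference from overwrite-in-place can be shown with a small in-memory sketch. It assumes a doubly linked revision chain; the `Fact` and `FactStore` names and their fields are hypothetical and are not memstore's schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Fact:
    id: int
    content: str
    supersedes: Optional[int] = None     # id of the fact this one replaced
    superseded_by: Optional[int] = None  # id of the fact that replaced this one


class FactStore:
    def __init__(self) -> None:
        self._facts: Dict[int, Fact] = {}
        self._next_id = 1

    def add(self, content: str) -> Fact:
        fact = Fact(id=self._next_id, content=content)
        self._facts[fact.id] = fact
        self._next_id += 1
        return fact

    def supersede(self, old_id: int, content: str) -> Fact:
        """Replace a fact with a new revision; the old one stays in the store."""
        old = self._facts[old_id]
        if old.superseded_by is not None:
            raise ValueError(f"fact {old_id} is already superseded")
        new = self.add(content)
        new.supersedes = old.id
        old.superseded_by = new.id
        return new

    def active(self) -> List[Fact]:
        """Facts that have not been replaced by a newer revision."""
        return [f for f in self._facts.values() if f.superseded_by is None]

    def history(self, fact_id: int) -> List[Fact]:
        """Walk back from a fact to its oldest revision, newest first."""
        chain: List[Fact] = []
        current = self._facts.get(fact_id)
        while current is not None:
            chain.append(current)
            if current.supersedes is None:
                break
            current = self._facts.get(current.supersedes)
        return chain
```

Recall would read only from `active()`, while `history()` answers the "what did I believe before" question without any data having been deleted.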

Proactive Context Injection with Claude Code Hooks

By Matthew Hunter | Mar 24, 2026 | claude-code, mcp, memstore

Claude Code sessions start with amnesia. Every conversation begins cold — no memory of what you worked on yesterday, what invariants matter in the file you’re about to edit, or what tasks are still pending from last week. CLAUDE.md helps by injecting static project context, and MCP tools let the model search for facts on demand. But both of those require something to go right first: either the static file happens to cover the relevant topic, or the model decides to search before acting. In practice, the model often doesn’t search. It plows ahead with what it has, and the most valuable context — the constraint you documented last Tuesday, the task you left half-finished — stays in the database unsurfaced. This post is about closing that gap with hooks: small scripts that fire at specific points in the Claude Code lifecycle and inject relevant context automatically, before the model even knows it needs it.
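A hook of this kind can be very small. The sketch below assumes the hook receives a JSON payload on stdin and that whatever it prints to stdout is added to the conversation context; the `build_context` helper and the use of a `cwd` field are illustrative assumptions, and a real hook would query the memory store here instead of echoing the working directory.

```python
import json
import sys
from datetime import datetime


def build_context(payload: dict, now: datetime) -> str:
    """Assemble the text to inject ahead of the user's prompt."""
    lines = [f"Current date and time: {now.strftime('%Y-%m-%d %H:%M')}"]
    cwd = payload.get("cwd")  # assumed field: the session's working directory
    if cwd:
        lines.append(f"Working directory: {cwd}")
    return "\n".join(lines)


if __name__ == "__main__":
    raw = sys.stdin.read()
    payload = json.loads(raw) if raw.strip() else {}
    # Printed text is what the hook contributes to the context.
    print(build_context(payload, datetime.now()))
```

Keeping the formatting logic in a pure function makes the hook testable without a live session.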

Building Persistent Memory for AI Agents

By Matthew Hunter | Mar 23, 2026 | memstore, mcp, claude-code, sqlite

AI coding assistants are goldfish with a PhD. They can solve complex problems within a single session — refactoring a module, debugging a race condition, designing an API — but the moment the conversation ends, everything they learned about your project evaporates. After months of building software with Claude Code, I found myself re-explaining the same project conventions, the same architectural decisions, the same mistakes we’d already caught and fixed, at the start of every session. CLAUDE.md files help, but they’re static and they don’t scale. You can’t stuff a dozen projects’ worth of design context into a single markdown file. So I built memstore: a persistent memory system that gives AI agents durable, searchable knowledge across sessions, projects, and machines.
