On March 24, 2026, the LiteLLM PyPI package was compromised. Versions 1.82.7 and 1.82.8, published by an account labeled TeamPCP, contained malicious code. I had LiteLLM running in my homelab as a routing layer between local AI clients and several model backends. This post is the postmortem: what I was running, what the exposure actually was, why I removed LiteLLM rather than just upgrading, and what the incident clarified about supply chain risk in homelab AI infrastructure.

The short version: my GHCR Docker image was likely unaffected, but “likely” was not the standard I wanted to be operating at, and LiteLLM had stopped being load-bearing once I had a clearer architectural direction. The right response was removal, not patching.

What I Was Running

  • LiteLLM container on docker-web (192.168.2.10:4000) acting as the OpenAI-compatible API gateway for several internal AI clients.
  • Routing rules sending traffic to local Ollama instances (Strix Halo workstation, rainbow) and a few cloud providers.
  • Per-client API keys with model allowlists — Herald used a key restricted to gemma3:12b and an embedding model, for example.
  • Source: a pinned GHCR Docker image, not a fresh PyPI install.
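
For readers unfamiliar with per-key model allowlists, the idea can be sketched in a few lines. This is an illustration of the concept only, not LiteLLM's configuration format or API. The key identifier `herald-key` and the embedding model name are hypothetical placeholders; only `gemma3:12b` comes from the setup described above.

```python
# Hypothetical allowlist table: key identifier -> models that key may request.
# "herald-key" and "example-embedding-model" are placeholders for illustration.
ALLOWLISTS = {
    "herald-key": {"gemma3:12b", "example-embedding-model"},
}


def is_allowed(key_id, model, allowlists=ALLOWLISTS):
    """Return True only if the key is known and the model is on its allowlist."""
    # Unknown keys get an empty allowlist, so they are denied by default.
    return model in allowlists.get(key_id, set())


if __name__ == "__main__":
    print(is_allowed("herald-key", "gemma3:12b"))
    print(is_allowed("herald-key", "some-other-model"))
```

The deny-by-default lookup is the part that matters: a key that is missing from the table can request nothing.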

The Exposure Question

Three layers had to be checked:

  1. Was my image affected? The GHCR image was built before the compromised PyPI versions shipped, and its version was pinned, so my running container could not have pulled the bad code via a routine update.
  2. Could it have been re-pulled? The image digest was pinned in the deployment, so a fresh pull could not have silently substituted a different image.
  3. Was there a build-time path I’d missed? The image build process did not invoke pip install litellm against the live PyPI index. The install was against a known-good version baked into the base image.
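
Two of these checks are mechanical enough to script. The following is a minimal sketch, not the procedure I actually ran: it tests whether an image reference is pinned by digest rather than by tag, and whether a given LiteLLM version is one of the two releases named above. The helper names are my own, and the version check only inspects the Python environment it runs in, so it would have to be executed inside the container to say anything about the container.

```python
import re
from importlib import metadata
from typing import Optional

# The two releases this post names as compromised.
COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}

# A digest-pinned reference ends in "@sha256:" followed by 64 hex characters.
_DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """Return True if the image reference is pinned by digest, not just by tag."""
    return bool(_DIGEST_RE.search(image_ref))


def is_compromised(version: str) -> bool:
    """Return True if the version string is one of the known-bad releases."""
    return version.strip() in COMPROMISED_VERSIONS


def installed_litellm_version() -> Optional[str]:
    """Return the litellm version installed in this environment, or None."""
    try:
        return metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_litellm_version()
    if version is None:
        print("litellm is not installed in this environment")
    elif is_compromised(version):
        print(f"litellm {version} is a known-bad release")
    else:
        print(f"litellm {version} is not on the known-bad list")
```

A clean result here only rules out the two named versions. It says nothing about a compromised release that has not been identified yet.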

Conclusion: probably not exposed. But “probably” was a hedge against gaps I didn’t know about.

Why “Probably Not Exposed” Was the Wrong Standard

I hold my homelab to the same standard I hold production security to: if a control plane component is compromised in the wild, you assume you’re affected until you’ve proven you aren’t, and you shorten time-to-patch even when you believe you’re clean.

For LiteLLM specifically, a few things made “patch and continue” unattractive:

  • Routing-layer position. LiteLLM saw every prompt and every response from every internal AI client. If it had been compromised, the blast radius would have been everything that ran through it.
  • API key storage. Cloud provider keys were configured in LiteLLM. A compromise would have leaked them.
  • Operational complexity. Maintaining a routing layer that routinely needed surgery (per-key model allowlists, broken upstream features, model name mismatches) had been costing more than it was returning.
  • Redundancy. Olla covered local inference routing, and direct connections to Ollama via SSH tunnel were already working for the most demanding consumer (Herald, via autossh tunnel from docker-web to rainbow’s Ollama).

LiteLLM had been the central hub. By the time of the incident, it was a hub through which less and less actually needed to flow. That’s the easiest kind of removal decision.

Removal Mechanics

  • Cataloged every consumer of the LiteLLM endpoint.
  • For each: switched to direct Ollama (via SSH tunnel where networks didn’t directly route), or switched to Olla, or migrated the cloud-provider call to the relevant provider SDK with its own key.
  • Rotated every API key that had been stored in LiteLLM’s config, regardless of compromise probability.
  • Tore down the LiteLLM container.
  • Removed the LiteLLM ingress route and the autossh service that had supported it.
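
The cataloging step is the easiest one to get wrong by memory alone. One way to approach it, sketched here under the assumption that client configuration lives in plain-text files under a single directory tree, is to search for the gateway address. The function name and the directory layout are hypothetical; the address is the one listed earlier in this post.

```python
from pathlib import Path

# The gateway address from this post. Add hostnames or aliases if clients use them.
ENDPOINT_MARKERS = ("192.168.2.10:4000",)


def find_consumers(root, markers=ENDPOINT_MARKERS):
    """Return sorted paths of files under root that mention any marker."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        if any(marker in text for marker in markers):
            hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    # Scans the current directory; point this at wherever client configs live.
    for hit in find_consumers("."):
        print(hit)
```

A text search will miss clients that build the URL at runtime or reach the gateway through a DNS name, so it complements the gateway's own access logs rather than replacing them.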

What This Clarified About Supply Chain Risk in Homelab AI

A few takeaways I didn’t have framed this clearly before the incident:

  1. Pinning helps less than you think when the package itself is the threat surface. Pinning 1.82.6 instead of 1.82.7 protects you once 1.82.7 is known to be bad; it doesn’t help if next week’s release is also compromised and you upgrade to it routinely.
  2. AI infrastructure is high-leverage. Compromise of a routing layer or an embedding model affects every downstream consumer, and LLM-backed consumers won’t tell you when their behavior changes subtly.
  3. “Used by everything” is a reason to remove, not to keep. Centrality is supply-chain risk concentration. If a component is touching every prompt and every response, you want it to be the thing you’re most willing to throw away.
  4. The compensating control is observability, not just version pinning. Logging what a routing layer actually did during the suspected compromise window is more useful than knowing what version it was.
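
To make the observability point concrete, here is a sketch of the kind of question I would want to be able to answer: which upstream hosts did the routing layer contact during the suspected window, and were any of them unexpected? The record format (a `ts` field holding an ISO 8601 timestamp with offset, and an `upstream` field) is an assumption for illustration, not LiteLLM's log schema.

```python
from collections import Counter
from datetime import datetime, timezone


def upstreams_in_window(records, start, end):
    """Count requests per upstream host for records with start <= ts < end."""
    counts = Counter()
    for record in records:
        # Timestamps are assumed to carry an explicit UTC offset.
        ts = datetime.fromisoformat(record["ts"])
        if start <= ts < end:
            counts[record["upstream"]] += 1
    return counts


def unexpected_upstreams(counts, allowed):
    """Return upstream hosts seen in the window that are not on the allowlist."""
    return sorted(host for host in counts if host not in allowed)


if __name__ == "__main__":
    # Hypothetical records; a real run would parse them from the gateway's logs.
    records = [
        {"ts": "2026-03-24T10:00:00+00:00", "upstream": "rainbow:11434"},
        {"ts": "2026-03-24T11:00:00+00:00", "upstream": "203.0.113.9:443"},
    ]
    start = datetime(2026, 3, 24, tzinfo=timezone.utc)
    end = datetime(2026, 3, 26, tzinfo=timezone.utc)
    counts = upstreams_in_window(records, start, end)
    print(unexpected_upstreams(counts, allowed={"rainbow:11434"}))
```

This only works if the logs exist before the incident and are stored somewhere the routing layer itself cannot rewrite.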

Things I’m Still Doing Differently

  • Treating every AI infrastructure component as a candidate for “could I delete this and route around it?” before reaching for “could I patch and continue?”
  • Direct-to-backend connections wherever the routing layer wasn’t earning its keep.
  • Per-component supply chain notes: who maintains this, what’s the historical trust level, what would the blast radius be.
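
The per-component notes do not need tooling, but giving them a fixed shape makes them comparable. A minimal sketch, with field names of my own choosing and a deliberately crude blast-radius measure (a count of exposed items):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComponentNote:
    """One supply chain note per infrastructure component."""
    name: str
    maintainer: str       # who maintains this
    trust_history: str    # free-text summary of past incidents, if any
    sees_prompts: bool    # does it sit in the prompt/response path?
    holds_secrets: bool   # does it store API keys or other credentials?
    consumers: List[str] = field(default_factory=list)

    def blast_radius(self) -> List[str]:
        """List what a compromise of this component would expose."""
        exposed = [f"consumer: {c}" for c in self.consumers]
        if self.sees_prompts:
            exposed.append("prompts and responses")
        if self.holds_secrets:
            exposed.append("stored credentials")
        return exposed


def review_order(notes: List[ComponentNote]) -> List[ComponentNote]:
    """Sort components so the largest blast radius is reviewed first."""
    return sorted(notes, key=lambda n: len(n.blast_radius()), reverse=True)
```

Sorting by blast radius puts the most central component at the top of the list, which is where the "could I delete this?" question should start.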

Closing

Most homelab postmortems are about hardware or misconfiguration. This one is about a software supply chain attack against an AI infrastructure component, in 2026, when the AI tooling ecosystem is moving fast enough that a lot of what’s load-bearing today is younger than the most recent CVE in your operating system. The hygiene that was sufficient for a stable Linux server is not sufficient for an AI routing layer. The discipline to remove rather than patch is going to come up again.