Matchlock: The MicroVM Sandbox That Keeps Your Secrets Out of AI Agents’ Reach
What happens when you give an AI agent root access to run arbitrary code? If you’re lucky, it installs packages and writes files. If you’re unlucky, your API keys end up somewhere they shouldn’t be. Matchlock offers a third option: give agents a full Linux environment while keeping your secrets firmly on the host.
The Core Insight
Matchlock takes a radically different approach to AI agent sandboxing. Instead of trying to filter what an agent can do (allowlists, capability restrictions, output scanning), it isolates the entire execution environment at the VM level—then uses a transparent proxy to inject secrets in flight at the network layer.
Here’s the elegant part: the sandbox never sees your real credentials. When your agent makes an API call, it uses placeholder environment variables. The host-side proxy intercepts the request and swaps in the actual secret before it hits the wire. Even if the agent is tricked into exfiltrating its environment variables, attackers get worthless placeholders.
# The agent only ever sees: SANDBOX_SECRET_a1b2c3d4...
# But the API call goes out with: sk-real-anthropic-key-here
matchlock run --image python:3.12-alpine \
  --secret ANTHROPIC_API_KEY@api.anthropic.com python agent.py
The architecture combines:
- Firecracker microVMs (Linux) or Virtualization.framework (macOS)
- Network allowlisting via nftables/gVisor
- Copy-on-write filesystems for instant cleanup
- MITM proxy for credential injection (sketched below)
- Sub-second boot times for practical iterative use
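The credential-injection piece is the one worth lingering on. Matchlock ships its own host-side proxy, but to make the idea concrete, here is a minimal sketch of the same technique written as a mitmproxy addon. The placeholder token, environment variable name, and host binding are illustrative assumptions, not Matchlock's actual implementation.

# inject_secrets.py -- run on the HOST with: mitmdump -s inject_secrets.py
# Rewrites outbound requests so placeholder tokens become real credentials.
import os
from mitmproxy import http

# placeholder seen inside the sandbox -> (real secret from host env, host it may reach)
SECRETS = {
    "SANDBOX_SECRET_a1b2c3d4": (os.environ["ANTHROPIC_API_KEY"], "api.anthropic.com"),
}

def request(flow: http.HTTPFlow) -> None:
    for placeholder, (real_value, allowed_host) in SECRETS.items():
        # Only substitute when the request targets the host this secret is bound to.
        if flow.request.pretty_host != allowed_host:
            continue
        for name, value in list(flow.request.headers.items()):
            if placeholder in value:
                flow.request.headers[name] = value.replace(placeholder, real_value)

The important property is that the real value only ever exists in the host-side process; the guest's traffic is routed through the proxy transparently, so the agent's code needs no changes.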
Why This Matters
AI agents are becoming more capable—and more dangerous. Modern coding agents can install packages, modify files, make HTTP requests, and chain complex actions. The traditional security model of “just don’t give it network access” breaks down when the agent needs to call APIs to be useful.
Matchlock solves the credential leakage problem that keeps security teams up at night. Consider the attack surface:
- Prompt injection tricks an agent into running malicious code
- That code tries to exfiltrate environment variables
- With Matchlock, it gets useless placeholders instead of real secrets, as the snippet below shows
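Concretely, an exfiltration payload that dumps the environment would observe something like this (assuming, as the README example implies, that the placeholder is exposed under the original variable name):

# What a prompt-injected "dump the env" payload sees inside the sandbox:
import os
print(os.environ.get("ANTHROPIC_API_KEY"))
# -> SANDBOX_SECRET_a1b2c3d4...  (a placeholder that is worthless off the host)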
The network allowlisting adds defense in depth. Even if an agent somehow obtained real credentials, --allow-host ensures traffic can only reach explicitly permitted domains. There’s nowhere to exfiltrate data to.
What makes this practical is the developer experience. One-second boot times mean you’re not waiting around for VMs to spin up. The CLI is clean (matchlock run, matchlock exec, matchlock kill), and SDKs exist for both Go and Python if you want to embed sandboxing in your own tooling.
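I haven't explored the SDK surfaces themselves, so treat the following as a hypothetical stand-in: a small Python wrapper that drives the documented CLI (matchlock run with --image, --secret, and --allow-host) rather than the real SDK API, just to show how sandbox launches could slot into your own tooling.

# sandbox.py -- hypothetical wrapper around the Matchlock CLI (not the official SDK)
import subprocess

def run_sandboxed(script: str, image: str = "python:3.12-alpine") -> int:
    """Run an agent script inside a fresh Matchlock sandbox and return its exit code."""
    cmd = [
        "matchlock", "run",
        "--image", image,
        # The real key stays on the host; the guest only ever sees a placeholder.
        "--secret", "ANTHROPIC_API_KEY@api.anthropic.com",
        # Assumption: --allow-host composes with run to restrict outbound traffic.
        "--allow-host", "api.anthropic.com",
        "python", script,
    ]
    return subprocess.run(cmd).returncode

if __name__ == "__main__":
    raise SystemExit(run_sandboxed("agent.py"))

Because every sandbox starts from a copy-on-write filesystem and boots in about a second, a wrapper like this is cheap enough to invoke once per task.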
Key Takeaways
- Secrets never enter the VM: Credentials are injected at the network layer by a host-side proxy
- Sub-second boot times: Firecracker microVMs on Linux, Virtualization.framework on Apple Silicon
- Network allowlisting: Block all traffic except to explicitly permitted hosts
- Copy-on-write cleanup: Every sandbox gets a fresh filesystem that vanishes on exit
- Cross-platform: Same CLI works on Linux servers and MacBooks
- SDK available: Go and Python libraries for programmatic sandbox control
- Install via Homebrew:
brew install matchlock
Looking Ahead
As AI agents become more autonomous—and agentic workflows chain multiple tools together—the security implications compound. Matchlock represents a shift from “trust but verify” to “isolate and inject.”
The interesting question is how this pattern scales to multi-agent systems. When Agent A spawns Agent B which calls Agent C, who manages the credential injection? How do you maintain audit trails across VM boundaries?
For now, Matchlock offers a practical answer to the immediate problem: run untrusted AI code with confidence that your secrets stay safe, even when the agent itself can’t be trusted.
Based on analysis of the GitHub repository jingkaihe/matchlock ("Matchlock secures AI agent workloads with a Linux-based sandbox").