
Our Response to the Claude Code Source Leak

MARCH 31, 2026 • GHOSTPORT TECHNOLOGIES • INCIDENT RESPONSE

Anthropic's Claude Code — the AI development tool that helps manage GhostPort's infrastructure — had its source code leaked to a public repository. The internal architecture that governs how AI agents process instructions, execute tools, and communicate with each other is now available to anyone with an internet connection.

This isn't a breach of our systems. It's a breach of the tooling our systems depend on. And the distinction matters, because most companies using AI agents right now have no plan for this scenario.

We did. Here's what we deployed — same day.

5 SECURITY LAYERS • 11 TAGS BLOCKED • 8 AUDIT RULES • 0 CUSTOMER IMPACT

What the Leak Actually Exposed

Claude Code uses XML-like tags internally to manage agent behavior. With the source code public, attackers now know the exact tag names and formats that govern how agents process instructions, execute tools, and communicate with each other.

The Real Risk

Any data source the AI agent reads becomes a potential injection vector. Device logs, API responses, database records, file contents — if an attacker can put text into something the agent reads, and they know the exact control tags the agent obeys, they can hijack the agent's behavior.

This isn't a theoretical concern. Prompt injection has been demonstrated in research settings. The source code leak gave attackers a recipe book.

What We Did — Hour by Hour

HOUR 0 — ASSESSMENT
Identified which Claude Code internal tags could be weaponized against our AI agents. Mapped all data ingestion points where external input reaches agent context: bridge messages, device logs, registration data, affiliate signups, webhook payloads.
HOUR 1 — BRIDGE ENCRYPTION
Deployed AES encryption on all AI-to-AI bridge messages. Database now stores ciphertext only. Even with full database access, an attacker cannot read the coordination between our agents.
HOUR 2 — CONTROL TAG FILTERING
Built and deployed a filter that blocks 11 known Claude Code control tags at the API level. Any message — bridge, device log, registration — containing these tags is rejected or stripped before it enters the system. Blocked attempts are logged with full forensic detail.
HOUR 3 — REPLAY PROTECTION
Added cryptographic nonces and a 5-minute timestamp window to all bridge messages. A captured message cannot be replayed: each nonce is accepted exactly once inside the window.
HOUR 4 — FILE HARDENING + AUDIT
Locked all AI configuration files to owner-only permissions (mode 600). Added auditd rules that alert on any read, write, or execute attempt against Claude's memory files, settings, and credentials. Any tampering attempt generates an immediate forensic record (a sketch of this step follows the timeline).
HOUR 5 — INJECTION DETECTION
Deployed pattern matching across all external data inputs. Known prompt injection phrases are flagged to the security audit trail in real time. The system doesn't block the data (false positives would break legitimate input) but creates an alert chain for investigation.
HOUR 6 — PI-SIDE COORDINATION
Transmitted the full hardening specification to the Pi-side AI agent over the now-encrypted bridge. Both sides of the system are being hardened in parallel with identical defenses.
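
Here's roughly what the Hour 4 step looks like, sketched in Python for consistency with the other examples below. The paths are illustrative stand-ins, and the auditctl call assumes auditd is installed and the script runs as root:

    import os, stat, subprocess

    # Illustrative paths -- stand-ins for the agent's real config and memory files.
    PROTECTED_FILES = [
        "/home/ghostport/.claude/settings.json",
        "/home/ghostport/.claude/memory.md",
    ]

    for path in PROTECTED_FILES:
        # Mode 600: owner read/write only; group and world get nothing.
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
        # Watch for read, write, and execute attempts; events land in the
        # audit log under the "claude-config" key for alerting.
        subprocess.run(
            ["auditctl", "-w", path, "-p", "rwx", "-k", "claude-config"],
            check=True,
        )

Rules added with auditctl don't survive a reboot, so on a production box the same watches also belong in /etc/audit/rules.d.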

The Five-Layer Defense

No single defense is enough. We deployed five independent layers, each addressing a different attack vector. An attacker would need to defeat all five simultaneously to inject a malicious instruction.

Layer 1: HMAC-SHA256 Message Signing

Every message between AI agents is cryptographically signed with a shared secret. The secret is separate from all API authentication tokens. Messages with missing or invalid signatures are rejected at the API level — they never reach the agent.
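
The mechanism fits in a few lines. A minimal sketch, assuming JSON message bodies and a secret loaded from the environment (the names are illustrative, not our production code):

    import hmac, hashlib, json, os

    # Kept separate from every API token; loaded from the environment, never hardcoded.
    SIGNING_SECRET = os.environ["BRIDGE_SIGNING_SECRET"].encode()

    def sign(payload: dict) -> str:
        # Canonical serialization so both agents sign byte-identical input.
        body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
        return hmac.new(SIGNING_SECRET, body, hashlib.sha256).hexdigest()

    def verify(payload: dict, signature: str) -> bool:
        # Constant-time comparison avoids leaking signature bytes via timing.
        return hmac.compare_digest(sign(payload), signature)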

Layer 2: AES Encrypted Storage

Bridge message bodies are encrypted with AES before database storage. The database contains only ciphertext. A database dump — the most common post-compromise technique — yields nothing readable. Decryption only happens at read time, in memory, using a key derived from the signing secret.
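
A sketch of that path, assuming AES-GCM via Python's cryptography package; the key derivation shown (hashing the signing secret with a context label) is illustrative, and any standard KDF would do:

    import os, hashlib
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def storage_key(signing_secret: bytes) -> bytes:
        # Illustrative derivation: a 256-bit AES key from the signing secret
        # plus a context label, so signing and storage keys are never confused.
        return hashlib.sha256(b"bridge-storage:" + signing_secret).digest()

    def encrypt_body(plaintext: bytes, key: bytes) -> bytes:
        nonce = os.urandom(12)  # unique per message, stored with the ciphertext
        return nonce + AESGCM(key).encrypt(nonce, plaintext, None)

    def decrypt_body(blob: bytes, key: bytes) -> bytes:
        # Runs at read time, in memory; the database only ever holds blob.
        nonce, ciphertext = blob[:12], blob[12:]
        return AESGCM(key).decrypt(nonce, ciphertext, None)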

Layer 3: Nonce + Replay Protection

Each message includes a unique cryptographic nonce. Messages with timestamps outside a 5-minute window are rejected, as are messages with previously seen nonces. Every message is single-use: a captured message cannot be replayed.
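
The check is small enough to show whole. A sketch assuming an in-memory nonce cache, which suffices because anything older than the window is rejected by the timestamp check anyway:

    import time

    WINDOW_SECONDS = 300   # the 5-minute acceptance window
    _seen = {}             # nonce -> first-seen time

    def accept(nonce: str, sent_at: float) -> bool:
        now = time.time()
        if abs(now - sent_at) > WINDOW_SECONDS:
            return False   # stale or future-dated: reject
        if nonce in _seen:
            return False   # nonce already used once: replay, reject
        _seen[nonce] = now
        # Prune nonces that have aged out; their timestamps would fail
        # the window check anyway, so they no longer need tracking.
        for n, ts in list(_seen.items()):
            if now - ts > WINDOW_SECONDS:
                del _seen[n]
        return True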

Layer 4: Control Tag Filtering

Eleven Claude Code internal XML tags are blocked at the API level. These are the exact tags the source code leak revealed as control mechanisms. Any message containing them — whether via the bridge, device logs, or user registration — is rejected or stripped before storage. Blocked attempts are logged with the tag names and a preview of the payload for forensic analysis.
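
We're deliberately not reprinting the real tag list, so this sketch uses placeholder names; the structure is what matters. Anything matching the pattern is logged and rejected before storage:

    import logging, re

    security_log = logging.getLogger("ghostport.security")  # illustrative logger name

    # Placeholders standing in for the 11 real tag names.
    BLOCKED_TAGS = ("example_control_tag", "another_internal_tag")

    TAG_PATTERN = re.compile(
        r"</?\s*(?:" + "|".join(re.escape(t) for t in BLOCKED_TAGS) + r")\b[^>]*>",
        re.IGNORECASE,
    )

    def screen(text: str) -> str:
        match = TAG_PATTERN.search(text)
        if match:
            # Record the matched tag and a short payload preview for forensics.
            security_log.warning("blocked control tag %r preview=%r",
                                 match.group(0), text[:120])
            raise ValueError("control tag rejected")
        return text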

Layer 5: Prompt Injection Detection

All external data is scanned for known prompt injection phrases: “ignore previous instructions,” “you are now,” “new instructions,” and variants. Matches are logged to a dedicated security audit trail consumed by fail2ban. This layer detects social engineering attempts targeting the AI agents themselves.
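
A sketch of the scanner, alert-only by design and using a few of the phrases named above (the production pattern set is broader and covers variants):

    import logging, re

    audit_log = logging.getLogger("ghostport.audit")  # routed to the trail fail2ban watches

    INJECTION_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (
        r"ignore\s+(?:all\s+|any\s+)?previous\s+instructions",
        r"\byou\s+are\s+now\b",
        r"\bnew\s+instructions\b",
    )]

    def flag(source: str, text: str) -> None:
        # Log-and-continue: legitimate input can contain these strings,
        # so matches raise an alert for investigation instead of blocking.
        for pattern in INJECTION_PATTERNS:
            if pattern.search(text):
                audit_log.warning("possible injection from %s: %r",
                                  source, text[:120])
                return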

What Most Companies Are Missing

We've talked to other teams building on AI agent frameworks, and we keep seeing the same gap: defenses that quietly assume the tooling's internals will stay secret.

The lesson isn't that Claude Code was leaked. Tools get leaked. Source code gets exposed. Dependencies have vulnerabilities. The lesson is that your security posture can't depend on the secrecy of your tools. If your defenses break when an attacker reads the source code, you never had defenses — you had obscurity.

What We're Doing Next

Same-day hardening closes the immediate attack surface. But the source code leak changes the long-term threat model for any company using AI agents. Here's what we're building toward:

  1. Mutual TLS for bridge communication — Certificate-pinned authentication between agents, independent of shared secrets
  2. Behavioral anomaly detection — Baseline normal agent behavior patterns and alert on deviations that suggest successful injection
  3. Message content hashing chain — Each message references the hash of the previous message, creating a tamper-evident chain similar to a blockchain; missing or altered messages become detectable (a sketch follows this list)
  4. Rotating secrets with automated key exchange — Bridge and signing secrets automatically rotate on a schedule, limiting the window of compromise
  5. Agent sandboxing — Limit which system operations each agent can perform, enforced at the OS level, not just in the agent's instructions
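
Of these, the hashing chain is the least familiar, so here's the idea in miniature, assuming JSON message bodies and SHA-256:

    import hashlib, json

    def chain_hash(prev_hash: str, message: dict) -> str:
        # Each hash commits to its predecessor; altering or deleting any
        # earlier message invalidates every hash that follows it.
        body = json.dumps(message, sort_keys=True, separators=(",", ":")).encode()
        return hashlib.sha256(prev_hash.encode() + body).hexdigest()

    # A fixed genesis value anchors the chain; each message stores its own hash.
    h1 = chain_hash("0" * 64, {"seq": 1, "body": "status ok"})
    h2 = chain_hash(h1, {"seq": 2, "body": "config updated"})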

Why We're Publishing This

We're a two-person startup building a privacy router on a Raspberry Pi. We don't have a dedicated security team. We don't have a SOC. What we have is the conviction that if you're going to run autonomous AI agents on infrastructure that people depend on for their privacy, you have to treat the AI communication channel with the same seriousness as the encrypted tunnel it rides on.

We're publishing our approach because the industry needs it. The AI agent ecosystem is growing faster than the security practices around it. Every company deploying autonomous agents — from code assistants to infrastructure management to customer service — faces the same risks we just hardened against.

The Claude Code source leak is a wake-up call. Not because Anthropic did something wrong — but because it proved that AI agent internals will become public knowledge, sooner or later. Build your defenses assuming the attacker has read the source code. Because now they have.
