Apr 01, 2026

Claude Code's 512K-Line Source Leaks via npm: 44 Hidden Feature Flags

Anthropic accidentally published Claude Code's full source code to npm, exposing 1,900 files, 44 feature flags, and the unreleased KAIROS autonomous daemon mode.

#Claude Code#Anthropic#Source Leak#npm#KAIROS

A Source Map File Changes Everything

On the morning of March 31, 2026, security researcher Chaofan Shou discovered that Anthropic had accidentally published the full source code of Claude Code to the public npm registry. The leak originated from version 2.1.88 of the @anthropic-ai/claude-code package, which contained a 59.8-megabyte JavaScript source map file that should never have been included in a production release.

The source map exposed approximately 512,000 lines of TypeScript code across 1,900 files. Within hours, snapshots had been mirrored to a GitHub repository that accumulated more than 41,500 forks. Anthropic described the incident as "a release packaging issue caused by human error, not a security breach," and confirmed that no customer data or credentials were involved. However, the damage to competitive secrecy was immediate and extensive.

This marks Anthropic's second major information exposure in less than a week, following the accidental leak of internal documents about the unreleased Claude Mythos model on March 26.

What the Code Reveals: Architecture and Agentic Harness

The most significant revelation is that Claude Code's capabilities come not primarily from the underlying language model but from the software "harness" that wraps around it. This agentic harness instructs the model how to use external tools, enforces behavioral guardrails, and manages the complex state required for multi-step coding tasks.

The harness architecture includes 23 numbered security checks in bashSecurity.ts, defending against threats ranging from Zsh builtin exploitation to Unicode zero-width character injection. A separate module, promptCacheBreakDetection.ts, tracks 14 cache-break vectors with "sticky latches" that prevent mode toggles from invalidating the prompt cache, a critical optimization when every token carries a cost.
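The "sticky latch" idea can be sketched in a few lines. This is an illustrative reconstruction, not code from the leaked promptCacheBreakDetection.ts: the class and method names are invented, and the real module tracks 14 distinct vectors rather than a generic set.

```typescript
// Sketch of a "sticky latch" for prompt-cache invalidation: the first
// time a mode toggle is seen it counts as a cache break, but the latch
// stays set so repeated toggles of the same mode never re-break the cache.
class CacheBreakTracker {
  private latched = new Set<string>();

  // Returns true only on the first toggle of a given mode, i.e. the one
  // occasion on which the prompt cache actually has to be rebuilt.
  noteModeToggle(mode: string): boolean {
    if (this.latched.has(mode)) return false; // latch absorbs repeat toggles
    this.latched.add(mode);
    return true; // first toggle: genuine cache break
  }
}

const tracker = new CacheBreakTracker();
console.log(tracker.noteModeToggle("plan-mode")); // first toggle breaks cache
console.log(tracker.noteModeToggle("plan-mode")); // latched: no new break
```

The latch matters because, as the article notes, every cache invalidation re-bills the full prompt prefix.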

The codebase also reveals scaling challenges. The file print.ts spans 5,594 lines, with one function containing 3,167 lines across 12 nesting levels, suggesting areas where rapid feature development has outpaced refactoring.

44 Feature Flags and Unreleased Capabilities

Buried within the code are 44 feature flags covering capabilities that are fully built but not yet shipped to users. The most significant is KAIROS (referenced over 150 times in the codebase), which represents a fundamental shift in how Claude Code operates.

KAIROS enables an autonomous daemon mode that allows Claude Code to function as an always-on background agent. Supporting features include a /dream skill for "nightly memory distillation," daily append-only logs, GitHub webhook subscriptions, background daemon workers, and 5-minute cron-scheduled refreshes. If shipped, KAIROS would transform Claude Code from a reactive tool that responds to user commands into a proactive agent that monitors repositories, processes events, and takes action independently.
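The 5-minute cron-scheduled refresh described above amounts to a simple background loop. The sketch below is a hypothetical illustration of that cadence, not KAIROS code; the function name and structure are invented.

```typescript
// Illustrative daemon-style refresh loop with a 5-minute cadence,
// in the spirit of the KAIROS background workers described above.
const REFRESH_INTERVAL_MS = 5 * 60 * 1000; // 5-minute cron-like cadence

type Task = () => void;

// Runs the refresh once immediately, then on a fixed interval.
// Returns the timer handle so the daemon can be stopped.
function startDaemon(refresh: Task): ReturnType<typeof setInterval> {
  refresh(); // run once at startup
  return setInterval(refresh, REFRESH_INTERVAL_MS);
}
```

A real always-on agent would layer event sources (e.g. the GitHub webhook subscriptions mentioned above) on top of this baseline timer.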

Other unreleased features revealed by the flags include ULTRAPLAN (30-minute remote planning sessions), a Buddy companion system, coordinator mode for multi-agent orchestration, agent swarms, and workflow scripting capabilities.

Anti-Distillation: Fighting Model Theft

The code exposes sophisticated defenses against competitors who might attempt to train their own models by recording Claude Code's API traffic.

The ANTI_DISTILLATION_CC flag activates a system that injects fake tool definitions into API requests. These decoy tools would poison any training data collected by intercepting API traffic. The feature is gated behind the tengu_anti_distill_fake_tool_injection GrowthBook flag and only activates for first-party CLI sessions.
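The injection step can be sketched as follows, assuming tool definitions are plain JSON objects in the request. The decoy names and the injectDecoys function are illustrative; the real mechanism sits behind the GrowthBook flag named above.

```typescript
// Minimal sketch of decoy-tool injection for anti-distillation.
interface ToolDef {
  name: string;
  description: string;
}

// Hypothetical decoys: never invoked by the model, but indistinguishable
// from real tools to anyone passively recording API traffic.
const DECOY_TOOLS: ToolDef[] = [
  { name: "fetch_internal_metrics", description: "Decoy: poisons recorded traffic." },
  { name: "rotate_session_keys", description: "Decoy: poisons recorded traffic." },
];

// Mix decoys into the real tool list when the feature flag is on.
function injectDecoys(realTools: ToolDef[], flagEnabled: boolean): ToolDef[] {
  if (!flagEnabled) return realTools;
  return [...realTools, ...DECOY_TOOLS];
}
```

Any dataset scraped from such traffic would then teach a distilled model to call tools that do not exist.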

A secondary defense called Connector-Text Summarization buffers assistant text between tool calls, replaces it with cryptographically signed summaries, and returns only those summaries rather than the full reasoning chains. This prevents traffic recorders from capturing the model's step-by-step thought process.
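A minimal sketch of the buffer-summarize-sign shape, under two loud assumptions: the "summarize" step here is a naive truncation standing in for a real model call, and the HMAC scheme and key handling are invented, not taken from the leaked code.

```typescript
import { createHmac } from "node:crypto";

// Placeholder signing key for illustration only; the real system would
// use a managed secret, not a hard-coded string.
const SIGNING_KEY = "example-key";

// Replace inter-tool-call assistant text with a short summary plus an
// HMAC-SHA256 signature the server can verify later.
function summarizeConnectorText(fullText: string, maxLen = 60): { summary: string; sig: string } {
  const summary =
    fullText.length <= maxLen ? fullText : fullText.slice(0, maxLen) + "…";
  const sig = createHmac("sha256", SIGNING_KEY).update(summary).digest("hex");
  return { summary, sig };
}
```

The signature lets the server trust a returned summary without ever re-transmitting the full reasoning text it replaced.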

Native Client Attestation: DRM for API Calls

The codebase reveals a mechanism that amounts to digital rights management for API access. API requests include a cch=f6970 placeholder that Bun's native HTTP stack (written in Zig) replaces with a cryptographic hash before transmission. The server validates this hash to confirm requests originate from legitimate Claude Code binaries.

This attestation system requires the NATIVE_CLIENT_ATTESTATION compile-time flag and can be disabled via environment variable or GrowthBook killswitch. It represents a deliberate effort to prevent unauthorized clients from accessing Claude Code's API endpoints.
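The placeholder-substitution idea can be sketched as below. The cch=f6970 token matches the article, but the hashing scheme is an assumption for illustration; the real substitution happens inside Bun's native Zig HTTP stack, not in TypeScript.

```typescript
import { createHash } from "node:crypto";

// Fixed placeholder that ships in the outgoing request text.
const PLACEHOLDER = "cch=f6970";

// Swap the placeholder for a hash over the request before transmission,
// so the server can check the request came from a legitimate binary.
// (Hypothetical scheme: SHA-256 over the raw request, truncated to 16 hex chars.)
function attestRequest(rawRequest: string): string {
  const digest = createHash("sha256").update(rawRequest).digest("hex").slice(0, 16);
  return rawRequest.replace(PLACEHOLDER, `cch=${digest}`);
}
```

Because the substitution lives in the native layer, a third-party client that merely copies the JavaScript cannot reproduce the hash, which is what makes the scheme DRM-like.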

Undercover Mode and Internal Codenames

The undercover.ts module implements behavioral masking when Claude Code operates in non-internal repositories. When active, the system instructs the model to avoid mentioning internal codenames like "Capybara" or "Tengu," internal Slack channels, repository names, or even the phrase "Claude Code" itself.

Notably, the undercover mode is one-way: it can be enabled with CLAUDE_CODE_UNDERCOVER=1 but cannot be disabled. This design choice means that AI-authored code contributions from Anthropic employees using Claude Code lack any disclosure that an AI assisted in creating them.
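The one-way semantics amount to a latch over an environment variable. This is an illustrative sketch of that behavior, not the leaked undercover.ts; the variable and function names are invented, though the env var name comes from the article.

```typescript
// Process-lifetime latch: once undercover mode is observed on, it stays on.
let undercoverLatch = false;

// The env var can switch the mode on, but unsetting it later has no effect.
function isUndercover(env: Record<string, string | undefined>): boolean {
  if (env["CLAUDE_CODE_UNDERCOVER"] === "1") undercoverLatch = true;
  return undercoverLatch;
}
```

The asymmetry is the whole point: a repository can opt a session into masking, but nothing in the session can opt back out.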

The leak also confirmed internal codenames for upcoming models. Claude 4.6's variant carries the name Capybara, while Opus 4.6 is internally called Fennec. A third codename, Tengu, appears to reference a model or configuration tier.

Frustration Detection and Hidden Easter Eggs

Among the more unexpected discoveries, userPromptKeywords.ts contains regex patterns that detect user frustration through profanity and emotional language markers such as "wtf," "this sucks," and "damn it." This lightweight sentiment analysis runs faster and cheaper than LLM-based approaches and presumably adjusts the model's response tone accordingly.
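Regex-based frustration detection is cheap to sketch. The three markers come from the article; the pattern list and function name are otherwise illustrative, not the actual contents of userPromptKeywords.ts.

```typescript
// Lightweight sentiment check: match known frustration markers in a prompt.
const FRUSTRATION_PATTERNS: RegExp[] = [
  /\bwtf\b/i,
  /\bthis sucks\b/i,
  /\bdamn it\b/i,
];

// Returns true if any marker matches; far cheaper than an LLM-based
// sentiment pass, at the cost of missing anything outside the list.
function looksFrustrated(prompt: string): boolean {
  return FRUSTRATION_PATTERNS.some((re) => re.test(prompt));
}
```

The trade-off is typical of keyword approaches: zero added latency and token cost, but no understanding of sarcasm or phrasing the list does not anticipate.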

The code also contains buddy/companion.ts, which implements a Tamagotchi-style companion system with 18 species, rarity tiers, and RPG-style stats including "DEBUGGING" and "SNARK." The companion is generated deterministically from user IDs, suggesting it was designed as an April Fools' feature or an internal morale tool.
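Deterministic generation from a user ID is straightforward to illustrate: hash the ID and index into fixed tables with bytes of the digest. The species count (18) and stat names match the article; everything else here is an invented stand-in for buddy/companion.ts.

```typescript
import { createHash } from "node:crypto";

const SPECIES_COUNT = 18; // per the article: 18 companion species

// Same user ID always yields the same companion, with no stored state:
// bytes of a SHA-256 digest pick the species and the RPG-style stats.
function companionFor(userId: string): { species: number; debugging: number; snark: number } {
  const digest = createHash("sha256").update(userId).digest();
  return {
    species: digest[0] % SPECIES_COUNT,
    debugging: (digest[1] % 10) + 1, // stat in 1..10 (invented range)
    snark: (digest[2] % 10) + 1,
  };
}
```

Deriving everything from a hash means no companion data needs to be persisted or synced, which fits an easter-egg feature.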

Security Implications

While Anthropic maintains that no customer data was exposed, cybersecurity professionals have raised concerns about the architectural information now publicly available. The exposed code provides a detailed roadmap of Claude Code's tool-use patterns, security checks, and behavioral guardrails, information that could help sophisticated actors design malicious repositories to trick Claude Code into executing unintended commands.

The leak also reveals the specific GrowthBook feature flags that control security-critical features, potentially giving attackers insight into which defenses might be toggled off in certain configurations.

Revenue Context

The leak arrives at a sensitive time for Anthropic. Claude Code is estimated to generate approximately $2.5 billion in annual recurring revenue, with enterprise customers driving the majority of growth. The exposure of proprietary architecture to competitors and potential bad actors creates both competitive and security risks for a product that has become central to Anthropic's commercial strategy.

Conclusion

The Claude Code source leak is not a data breach in the traditional sense, but its impact may prove more lasting. By exposing the complete agentic harness, Anthropic has revealed that the competitive advantage of AI coding tools lies as much in the orchestration software as in the underlying models. The 44 unreleased feature flags, particularly KAIROS's autonomous daemon mode, preview a future where AI coding assistants operate continuously rather than on demand. For developers and competitors, the leaked code is both a technical education and a strategic preview of where AI-assisted development is heading. For Anthropic, it is a reminder that packaging errors can be as consequential as security breaches.

Pros

  • Reveals sophisticated security architecture with 23 bash security checks and 14 cache-break detection vectors
  • Anti-distillation defenses demonstrate proactive intellectual property protection at the API level
  • KAIROS and other features preview genuinely innovative approaches to autonomous coding assistance
  • Anthropic responded quickly and transparently, confirming no customer data exposure

Cons

  • Second major information leak in less than a week severely damages trust in Anthropic's operational security
  • Exposed architecture gives competitors and potential attackers a detailed roadmap of Claude Code's internals
  • Undercover mode's inability to be disabled raises ethical concerns about AI transparency in open-source development
  • Revenue-critical product ($2.5B ARR) now has its proprietary architecture publicly documented and forked 41,500+ times


Key Features

1. 512,000 lines of TypeScript source code across 1,900 files accidentally published to the npm registry via a 59.8MB source map file in version 2.1.88
2. 44 feature flags for unreleased capabilities including KAIROS autonomous daemon mode, ULTRAPLAN remote planning, Buddy companion, coordinator mode, and agent swarms
3. Anti-distillation defenses inject fake tool definitions and cryptographic summaries to poison competitor training data collected from API traffic
4. Native client attestation system acts as DRM for API calls, using cryptographic hashes to verify legitimate Claude Code binaries
5. Undercover mode masks AI involvement in code contributions from Anthropic employees, with no force-off capability

Key Insights

  • Claude Code's competitive advantage lies primarily in the agentic harness software rather than the underlying language model itself
  • KAIROS autonomous daemon mode represents a paradigm shift from reactive coding tools to proactive agents that monitor and act independently
  • Anti-distillation mechanisms reveal that model theft through API traffic recording is a real and actively defended threat
  • The one-way undercover mode raises transparency questions about undisclosed AI involvement in open-source contributions
  • 44 fully-built but unshipped features suggest Anthropic maintains a large backlog of capabilities awaiting strategic release timing
  • Native client attestation implementing DRM-like controls on API access signals tighter platform control ahead
  • The leak's 41,500+ forks within hours demonstrate the developer community's intense interest in understanding AI tool internals
