What exactly is the AI Agent doing? A full analysis of Claude Code's 500,000 lines of leaked code
512,000 lines of code, 1,906 files, 59.8 MB of source maps. In the early hours of March 31, Chaofan Shou from Solayer Labs discovered that Anthropic’s flagship product, Claude Code, had exposed its full source code in a public npm package. Within a few hours, the code was mirrored to GitHub, and the number of forks surpassed 41,000.
This isn’t the first time Anthropic has made this mistake. When Claude Code was first released in February 2025, the same kind of source-map leak occurred. The leaked version this time is v2.1.88, and the cause is identical: Bun, the build tool, generates source maps by default, and the .npmignore file failed to exclude them.
Most coverage has focused on the easter eggs in the leak, such as a virtual pet system and an “undercover mode” that lets Claude anonymously submit code to open-source projects. But the real question worth unpacking is: why does the same Claude model behave so differently in the web version versus Claude Code? And what exactly are those 512,000 lines of code doing?
The model is just the tip of the iceberg
The answer is hidden in the code structure. According to a reverse-engineering analysis of the leaked source by the GitHub community, of the 512,000 lines of TypeScript, only about 8,000 lines directly handle calls to the AI model’s API, roughly 1.6% of the total.
So what is the remaining 98.4% doing? The two biggest modules are the query engine (46,000 lines) and the tools system (29,000 lines). The query engine handles LLM API calls, streaming output, cache orchestration, and multi-turn conversation management. The tools system defines about 40 built-in tools and 50 slash commands, forming a plugin-like architecture where each tool has its own independent permission controls.
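As a rough illustration of what a plugin-like tools system with independent per-tool permission controls might look like, here is a minimal sketch. All names and shapes here are hypothetical, not taken from the leaked source; the dispatch is kept synchronous for brevity.

```typescript
// Hypothetical sketch of a plugin-style tool registry where each tool
// carries its own permission check, independent of all other tools.
type PermissionDecision = "allow" | "deny" | "ask";

interface Tool {
  name: string;
  // Every tool decides for itself whether a given input is permitted.
  checkPermission(input: unknown): PermissionDecision;
  run(input: unknown): string;
}

class ToolRegistry {
  private tools = new Map<string, Tool>();

  register(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  dispatch(name: string, input: unknown): string {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    const decision = tool.checkPermission(input);
    if (decision === "deny") throw new Error(`permission denied: ${name}`);
    // In a real system, "ask" would surface a confirmation prompt here.
    return tool.run(input);
  }
}

// Usage: a harmless echo tool that always allows itself.
const registry = new ToolRegistry();
registry.register({
  name: "echo",
  checkPermission: () => "allow",
  run: (input) => String(input),
});
```

The point of the shape is that adding a 41st tool never touches the permission logic of the other 40: each tool is a self-contained plugin.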
In addition, there are 25,000 lines of terminal UI rendering code (including a file called print.ts that is 5,594 lines long, with a single function spanning 3,167 lines), 20,000 lines of security and permission control (including 23 numbered Bash security checks and 18 blocked Zsh builtin commands), and 18,000 lines of a multi-agent orchestration system.
After analyzing the leaked code, machine learning researcher Sebastian Raschka pointed out that the reason Claude Code is stronger than the web version with the same model isn’t the model itself—it’s the software scaffolding built around the model, including repository context loading, dedicated tool dispatching, caching strategies, and sub-agent collaboration. He even believes that if you apply the same engineering architecture to other models like DeepSeek or Kimi, you could also get a programming performance improvement that’s close to what Claude Code achieves.
A direct comparison makes the gap clear. When you ask a question in ChatGPT or the Claude web app, the model processes it and returns an answer; when the conversation ends, nothing is kept. Claude Code works completely differently: when it starts, it first reads your project files, understands your codebase structure, and recalls preferences you previously stated, like “don’t mock the database in tests.” It can execute commands directly in your terminal, edit files, and run tests. When it encounters complex tasks, it breaks them into multiple subtasks and assigns them to different sub-agents to work in parallel. In other words, the web AI is a Q&A window, while Claude Code is a collaborator living on your computer.
Someone has compared this architecture to an operating system: the 42 built-in tools are like system calls, the permission system is like user management, the MCP protocol is like device drivers, and sub-agent orchestration is like process scheduling. When tools ship, they are marked unsafe and write-capable by default unless developers explicitly declare them safe. The file-editing tool enforces a read-before-write rule: if you haven’t read a file, it won’t let you change it. This isn’t just a chatbot with a few tools tacked on; it’s a full runtime environment with an LLM at its core and complete safety mechanisms.
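The read-before-write rule can be sketched in a few lines. This is an illustration of the guard described above, with invented names, not the real implementation:

```typescript
// Hypothetical guard: a file may only be edited after it has been read
// in this session, so the model never blindly overwrites unseen content.
class FileEditGuard {
  private readFiles = new Set<string>();

  markRead(path: string): void {
    this.readFiles.add(path);
  }

  assertEditable(path: string): void {
    if (!this.readFiles.has(path)) {
      throw new Error(`refusing to edit ${path}: file has not been read yet`);
    }
  }
}
```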
This means one thing: the competitive barrier for AI products may not be at the model layer, but at the engineering layer.
Every cache miss multiplies costs by 10
Among the leaked code, there’s a file called promptCacheBreakDetection.ts that tracks 14 possible vectors that can cause prompt cache invalidation. Why would Anthropic engineers spend so much effort preventing cache misses?
Take a look at Anthropic’s official pricing. For Claude Opus 4.6, for example, the standard input price is $5 per million tokens, but on a cache hit the read price drops to just $0.50, which is 90% cheaper. Conversely, every cache miss multiplies inference costs by 10.
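The arithmetic is worth making concrete. Using the two rates quoted above, here is the back-of-the-envelope cost of resending a large context with and without a cache hit:

```typescript
// Rates from the pricing quoted above: $5 per million input tokens
// uncached, $0.50 per million on a cache read.
const UNCACHED_PER_MTOK = 5.0;
const CACHED_PER_MTOK = 0.5;

function inputCostUSD(tokens: number, cacheHit: boolean): number {
  const rate = cacheHit ? CACHED_PER_MTOK : UNCACHED_PER_MTOK;
  return (tokens / 1_000_000) * rate;
}

// A 200k-token context resent on every turn:
const hit = inputCostUSD(200_000, true);   // $0.10
const miss = inputCostUSD(200_000, false); // $1.00, ten times more
```

Multiply that 10x factor by every turn of every session across all users, and the obsession with cache continuity stops looking like over-engineering.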
This explains the many seemingly “over-engineered” architectural decisions in the leaked code. When Claude Code starts, it loads the current git branch, the most recent commit history records, and the CLAUDE.md file as context. These static contents are cached globally; dynamic content is separated using boundary markers to ensure that each conversation doesn’t repeatedly process already-existing context. The code also includes a mechanism called sticky latches that prevents mode switching from destroying the caches that have already been established. Sub-agents are designed to reuse the parent process’s cache instead of rebuilding their own context windows.
There’s also a detail worth expanding. Anyone who has used AI programming tools knows that the longer the conversation, the slower the AI responses get, because each turn requires resending the entire prior history to the model. The usual approach is to delete old messages to free up space, but the problem is that deleting any message breaks the continuity of the cache, causing the whole conversation history to be reprocessed—both latency and costs spike at the same time.
In the leaked code, there’s a mechanism called cache_edits. Instead of truly deleting messages, it marks old messages with a “skip” flag at the API layer. The model can’t see those messages anymore, but the continuity of the cache isn’t broken. This means that after a multi-hour long conversation, if you clear out a few hundred old messages, the response speed in the next turn is almost the same as in the first turn. For ordinary users, this is the underlying answer to “why Claude Code can support conversations of unlimited length without slowing down.”
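The “skip instead of delete” idea can be sketched as follows. The field names are assumptions for illustration, not the real cache_edits API shape; the key property is that the message list itself is never shortened or reordered, so the cached prefix stays valid.

```typescript
// Illustrative sketch: old messages are flagged, not removed, so the
// request prefix the cache was built on is byte-for-byte unchanged.
interface Message {
  role: "user" | "assistant";
  content: string;
  skipped?: boolean; // hidden from the model, kept in the cached prefix
}

// "Delete" everything except the last `keepLast` messages by flagging.
function pruneOldMessages(history: Message[], keepLast: number): Message[] {
  return history.map((m, i) =>
    i < history.length - keepLast ? { ...m, skipped: true } : m
  );
}

// What the model actually sees this turn.
function visibleMessages(history: Message[]): Message[] {
  return history.filter((m) => !m.skipped);
}
```

A naive prune would splice messages out of the array, invalidating the cache from the first removed message onward; flagging trades a little storage for full cache continuity.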
According to the leaked internal monitoring data (from code comments in autoCompact.ts, dated March 10, 2026), before introducing an upper limit on automatic compression failures, Claude Code wasted about 250,000 API calls per day. There were 1,279 user sessions where continuous auto-compression failures occurred 50 times or more, and the worst session had 3,272 consecutive failures. The fix was just adding one line of restriction: MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3.
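The fix amounts to bounding a retry loop. The constant name appears in the article; the surrounding logic below is an illustrative sketch, not the actual autoCompact.ts code:

```typescript
// Cap consecutive auto-compaction failures instead of retrying forever.
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3;

// Returns how many attempts were made before succeeding or giving up.
function attemptAutoCompact(compact: () => boolean): number {
  let failures = 0;
  while (failures < MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES) {
    if (compact()) return failures + 1; // success ends the cycle
    failures += 1;
  }
  return failures; // give up instead of burning API calls indefinitely
}
```

Without the cap, a session whose compaction always fails (like the one with 3,272 consecutive failures) just keeps paying for doomed API calls.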
So for AI products, the model inference cost might not be the most expensive layer after all—cache management failures are.
44 switches pointing in the same direction
Hidden in the leaked code are 44 feature flags—functionality switches that have already been compiled, but haven’t been released publicly. According to community analysis, these flags are divided into five categories by functional domain, with the most dense being the “autonomous agents” category (12 flags), pointing to a system called KAIROS.
KAIROS is referenced more than 150 times in the source code. It runs as a resident background daemon: Claude Code is no longer just a tool that responds when you actively invoke it, but an always-on agent that continuously observes, records, and proactively acts when the timing is right. The guardrail is that it must not interrupt the user; any operation that could block the user for more than 15 seconds is deferred.
KAIROS also has built-in terminal focus awareness. The code contains a terminalFocus field that detects in real time whether the user is currently looking at the terminal window. When you switch to a browser or another app, the agent decides you’re “not there,” switches to autonomous mode, and proactively executes tasks and submits code without waiting for your confirmation. When you switch back to the terminal, the agent immediately returns to collaboration mode: first reporting what it just did, then asking for your input. The level of autonomy isn’t fixed; it fluctuates dynamically with your attention in real time. This solves a long-standing awkward problem for AI tools: fully autonomous AI makes people uneasy, while fully passive AI is too inefficient. KAIROS’s answer is to adjust the AI’s proactivity according to the user’s attention: if you’re watching, it behaves; if you walk away, it gets work done on its own.
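The attention-based mode switch described above can be sketched like this. Only the terminalFocus field and the 15-second threshold come from the article; everything else is a hypothetical reconstruction:

```typescript
// Sketch of attention-based autonomy: mode is derived from focus state.
type AgentMode = "collaborative" | "autonomous";

interface FocusState {
  terminalFocus: boolean; // is the user currently looking at the terminal?
}

function pickMode(state: FocusState): AgentMode {
  // User watching: behave and ask first. User away: act on its own.
  return state.terminalFocus ? "collaborative" : "autonomous";
}

const BLOCKING_THRESHOLD_SECONDS = 15;

function shouldDeferOperation(
  estimatedBlockSeconds: number,
  state: FocusState
): boolean {
  // Never block a present user for more than ~15 seconds; when the user
  // is away, long operations can run immediately in the background.
  return state.terminalFocus && estimatedBlockSeconds > BLOCKING_THRESHOLD_SECONDS;
}
```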
Another subsystem of KAIROS is called autoDream. After accumulating 5 sessions, or after every 24 hours, the agent starts a “reflection” process in the background, going through four steps. First, it scans existing memories to understand what it currently has. Next, it extracts new knowledge from the conversation logs. Then it merges the new and old knowledge—fixing contradictions and removing duplicates. Finally, it compresses the index and deletes outdated entries. This design borrows memory consolidation theories from cognitive science. When humans sleep, they organize memories from the day; when users leave, KAIROS organizes the project context. For ordinary users, this means the longer you use Claude Code, the more precise its understanding of your project becomes—not just “remembering what you said.”
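The four reflection steps above can be sketched as a pipeline. This is an illustration of the described flow under invented names, not the real autoDream code; each stage here is deliberately trivial:

```typescript
// The four autoDream stages as a toy pipeline over string "memories".
interface MemoryStore {
  entries: string[];
}

// 1. Scan existing memories: take stock of what is already known.
function scanExisting(store: MemoryStore): string[] {
  return [...store.entries];
}

// 2. Extract new knowledge from the session logs (here: non-empty lines).
function extractNew(sessionLogs: string[]): string[] {
  return sessionLogs.filter((line) => line.length > 0);
}

// 3. Merge old and new knowledge, removing duplicates.
function mergeKnowledge(oldK: string[], newK: string[]): string[] {
  return [...new Set([...oldK, ...newK])];
}

// 4. Compress the index and drop the oldest entries past a size cap.
function compact(merged: string[]): MemoryStore {
  return { entries: merged.slice(-100) };
}
```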
The second biggest category is “anti-distillation and security” (8 flags). The most noteworthy among them is the fake_tools mechanism. When four conditions are met simultaneously (a compile-time flag is enabled, the CLI entry point is active, a first-party API is in use, and the GrowthBook remote switch is true), Claude Code injects fake tool definitions into API requests. The goal is to contaminate datasets that might record API traffic and be used to train competing models. This is a brand-new defensive posture in the AI arms race: not stopping you from copying, but making you copy the wrong things.
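The four-way gate is just a conjunction of booleans. The flag names below are placeholders; only the four conditions themselves come from the article:

```typescript
// Sketch of the fake_tools gate: injection happens only when all four
// conditions hold at once.
interface InjectionContext {
  compileTimeFlagEnabled: boolean;
  cliEntryPoint: boolean;
  firstPartyApi: boolean;
  growthBookRemoteSwitch: boolean;
}

function shouldInjectFakeTools(ctx: InjectionContext): boolean {
  return (
    ctx.compileTimeFlagEnabled &&
    ctx.cliEntryPoint &&
    ctx.firstPartyApi &&
    ctx.growthBookRemoteSwitch
  );
}
```

Requiring all four means the behavior can be shipped in the binary but remain fully dormant until a remote switch flips, which is how feature flags let teams compile in unreleased functionality.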
In addition, the code also includes a Capybara model codename (split into three tiers: standard, fast, and a million-context-window version), which the community widely suspects is an internal codename for the Claude 5 series.
Easter egg: Hidden in 512,000 lines of code is an electronic pet
Amid all the serious engineering architectures and safety mechanisms, Anthropic engineers also quietly built a complete virtual pet system, internally codenamed BUDDY.
According to the leaked code and community analysis, BUDDY is a visual terminal pet that appears in the user input box area as an ASCII speech-bubble frame. It has 18 species (including capybaras, salamanders, mushrooms, ghosts, dragons, and a series of original creatures like Pebblecrab, Dustbunny, and Mossfrog), divided into five rarity tiers: Common (60%), Rare (25%), Epic (10%), Legendary (4%), and Mythic (1%). Each species also has a “shiny” variant. The rarest, a Shiny Legendary Nebulynx, appears with a probability of only one in ten thousand.
Each BUDDY has five attributes: DEBUGGING (debugging), PATIENCE (patience), CHAOS (chaos), WISDOM (wisdom), and SNARK (snark). They can also wear hats, with options including a crown, a top hat, a propeller hat, an aura, a wizard hat, and even a tiny duck. The hash value of the user ID determines which pet you hatch. Claude generates its name and personality for it.
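Deterministic hatching from a user-ID hash might look roughly like this. The rarity percentages come from the article; the hash function (FNV-1a here) and the tier cutoffs are illustrative assumptions:

```typescript
// Sketch: hash the user ID to a stable number, then map it onto the
// quoted rarity distribution. The real hashing scheme is unknown.
type Rarity = "Common" | "Rare" | "Epic" | "Legendary" | "Mythic";

// Tiny stable 32-bit FNV-1a hash, chosen here only for determinism.
function hashUserId(userId: string): number {
  let h = 0x811c9dc5;
  for (const ch of userId) {
    h ^= ch.codePointAt(0)!;
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

function rollRarity(userId: string): Rarity {
  const roll = hashUserId(userId) % 100; // 0..99
  if (roll < 60) return "Common";    // 60%
  if (roll < 85) return "Rare";      // 25%
  if (roll < 95) return "Epic";      // 10%
  if (roll < 99) return "Legendary"; // 4%
  return "Mythic";                   // 1%
}
```

Hashing the user ID instead of rolling randomly means the same user always hatches the same pet, with no server-side state needed.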
According to the rollout plan in the leak, BUDDY was originally set to begin its beta test from April 1 to April 7, with a formal launch in May, starting with Anthropic’s internal employees.
With 512,000 lines of code, 98.4% of it hardcore engineering, someone still spent time building an electronic salamander that wears a propeller hat. That may be the most human detail in the entire leak.