Reverse engineering Claude Code reveals two caching bugs that can silently increase API costs by 10-20 times

robot
Abstract generation in progress

CoinDesk news: according to monitoring by 1M AI News, a developer reverse-engineered the 228MB binary file of the standalone installer version of Claude Code using Ghidra, an MITM proxy, and radare2. They found two separate caching bugs that can raise API costs by 10–20x without users knowing. The related analysis has been submitted to GitHub (issue #40524). Anthropic has marked it as a regression bug and assigned it for handling. The first bug is in the customized Bun runtime used by the standalone installer version. Each time an API request is made, the runtime looks for a billing identifier in the request body and replaces it, but the replacement logic hits the first matching item in the request body. If the conversation history happens to include that string (for example, if the internal billing mechanism of Claude Code was discussed), the replacement will hit the message content rather than the system prompt, causing full cache rebuilding to be triggered on every request. A temporary workaround is to switch to running npx @anthropic-ai/claude-code; the npm package version does not include this replacement logic. The second bug affects all users who restore sessions using --resume or --continue, and it was introduced starting with v2.1.69. When restoring a session, the injection position of system-added information differs from that of a newly created session, causing the cache prefix to fail to match entirely. As a result, the entire conversation history is read from cache to being fully rewritten. Subsequent turns recover normally, but the restore operation itself has already generated substantial additional overhead, and there is currently no external workaround. The developer estimates that for a long conversation of about 500,000 tokens, Bug 1 adds about $0.04 per request, while Bug 2 adds about $0.15 per restore, and combined they can make the single-request cost exceed $0.20. Previously, Anthropic engineer Lydia Hallie confirmed that the speed at which users hit the usage limits is “far faster than expected.” In the Reddit comments, multiple users believe these two caching bugs may be one of the root causes of abnormal usage consumption.

View Original
This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.
  • Reward
  • Comment
  • Repost
  • Share
Comment
Add a comment
Add a comment
No comments
  • Pin