Zhipu AI’s overseas platform reportedly has a cache billing anomaly that inflates token charges nearly 19-fold

According to Data Inspection Beating Monitoring, Z.ai, Zhipu AI’s overseas developer platform, has reportedly shown a serious anomaly in its cache billing system. Developer Da7em reported to the company on X that, when using the platform’s GLM model, repeated context apparently failed to trigger cached-token billing; instead, the system treated it as entirely new input and charged for it repeatedly.

A billing screenshot shared by Da7em shows that the developer tool Zcode recorded actual usage of about 270,000 tokens, while the platform’s final billing statement showed nearly 5,000,000 tokens consumed, close to 19 times the recorded usage. To rule out client-side interference, Da7em ran comparative tests with several agent tools, including Claude Code, Hermes Agent, Zcode, and OpenCode, and reported that the anomaly appeared consistently across all of them. On that basis, Da7em attributed the root cause to the platform’s server-side caching and accounting system.
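As a quick check of the "nearly 19-fold" figure, the ratio can be worked out from the two rounded numbers quoted above. This is only a sketch: the variable names are illustrative, and both inputs are the approximate values from the report rather than exact figures from the billing statement.

```python
# Approximate figures quoted in the report; exact values are not given.
client_recorded_tokens = 270_000     # usage recorded by Zcode ("about 270,000")
platform_billed_tokens = 5_000_000   # platform bill ("nearly 5,000,000"), so an upper bound

# How many times larger the billed amount is than the recorded usage.
ratio = platform_billed_tokens / client_recorded_tokens
print(f"billed / recorded = {ratio:.1f}x")  # prints "billed / recorded = 18.5x"
```

Because the billed figure is described as "nearly" 5,000,000, the true ratio would be slightly below this value, which is consistent with the report rounding it to nearly 19 times.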
