Google Gemini API exposes “cache billing vulnerability”: developer still billed nearly 20,000 reais after deleting caches

A serious API billing anomaly recently surfaced on the Google AI Developer Forum. A developer posted a plea for help, reporting that the Gemini 3 Flash context caching (Context Caching) feature kept incurring charges of more than 1,000 BRL per hour even after the caches had been deleted through the API and the cache list came back empty. Within a few days the accumulated bill approached 20,000 Brazilian reais (several thousand US dollars). The developer has been forced to disable the Gemini API service entirely to stop the bleeding, and the incident has raised significant concern in the developer community.

Table of Contents

  • Charges continue after cache deletion, at over 1,000 BRL per hour
  • Emergency shutdown of API to stop bleeding, official fix not yet provided
  • Developer community panic, caution needed when using cache features

Hidden costs in large-model AI APIs have long worried developers, and Google’s Gemini API is now the subject of a report of apparent "ghost billing." On the Google AI Developer Forum, a post titled "Urgent: Huge Cache Cost Increase Issue (Part 2)" reports that the backend billing for the Gemini 3 Flash cache service (Context Caching) appears to have run out of control.

Charges continue after cache deletion, at over 1,000 BRL per hour

According to detailed BigQuery billing data provided by developer Danilo_Oliveira, the anomaly began on June 3, 2026. At first, the charge for the Gemini 3 Flash SKU "cache text storage token hours" (SKU ID: 583D-5DB6-4555) stayed around 20 to 30 BRL per hour, on usage of about 4 million token hours.

By June 6, however, the situation had deteriorated sharply. Metered usage in a single hour exceeded 200 million token hours, and hourly charges passed 1,000 BRL. By the early hours of June 7, a total of 341 abnormal billing incidents had pushed the bill to 17,847.21 BRL, suggesting that the billing system had lost control.
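
As a rough sanity check on these figures, the implied unit price can be computed from the reported usage and charges. The sketch below uses only the approximate numbers quoted in this article; the function name is illustrative, and because the source figures are rounded or stated as lower bounds, the results are order-of-magnitude estimates and not Google's published price.

```python
def brl_per_million_token_hours(cost_brl: float, token_hours: float) -> float:
    """Implied unit price in BRL per one million token hours."""
    return cost_brl / (token_hours / 1_000_000)


# Approximate figures quoted in the article (not exact billing rows).
early_low = brl_per_million_token_hours(20, 4_000_000)    # June 3, low end
early_high = brl_per_million_token_hours(30, 4_000_000)   # June 3, high end
spike = brl_per_million_token_hours(1_000, 200_000_000)   # June 6, lower bounds

# How much the metered usage grew between the two reported points.
usage_growth = 200_000_000 / 4_000_000

print(f"early: {early_low:.2f} to {early_high:.2f} BRL per million token hours")
print(f"spike: {spike:.2f} BRL per million token hours")
print(f"usage growth: {usage_growth:.0f}x")
```

On these numbers the implied unit price stays at roughly 5 to 7.5 BRL per million token hours while metered usage grows about 50-fold, which would suggest the bill was driven by the recorded storage volume and not by a change in price.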

Emergency shutdown of API to stop bleeding

Faced with the skyrocketing bill, the developer took every step available. They immediately stopped the scripts that create caches and used Google’s official REST API to confirm that the cache list had been "completely cleared." Yet even though the API showed no caches remaining, the backend billing system kept charging.
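
The article does not say which endpoint the developer called. As a minimal sketch, assuming the Gemini Developer API's `cachedContents` list endpoint on `v1beta` with an API key, a check for remaining caches might look like the following. The `fetch` parameter and both function names are hypothetical, introduced here so the paging logic can be exercised without network access.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Gemini Developer API (not Vertex AI) is in use.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def http_get_json(url):
    """Fetch a URL and decode the JSON response body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def list_cached_contents(api_key, fetch=http_get_json):
    """Return the resource names of all caches the API still reports."""
    names = []
    page_token = None
    while True:
        params = {"key": api_key}
        if page_token:
            params["pageToken"] = page_token
        url = f"{BASE_URL}/cachedContents?{urllib.parse.urlencode(params)}"
        page = fetch(url)
        # An empty or missing "cachedContents" field means no caches on this page.
        names.extend(item["name"] for item in page.get("cachedContents", []))
        page_token = page.get("nextPageToken")
        if not page_token:
            return names


# Example (requires a real key and network access):
# remaining = list_cached_contents("YOUR_API_KEY")
# print(remaining or "API reports no caches")
```

An empty result only shows what the API reports. As this case illustrates, it is worth cross-checking against the billing export, since the two can apparently disagree.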

Suspecting a bug in which Google’s backend failed to actually clear the cache records, the developer opened billing ticket #720261 to request official help. To keep the losses from growing, they ultimately took the drastic step of disabling the Gemini API service entirely in their Google Cloud project.

Developer community panic, caution needed when using cache features

Once posted on the forum, the incident quickly drew attention and discussion among other developers. Context Caching was designed to reduce the cost and latency of processing very long texts with large language models (LLMs), yet in this case it became a black hole for funds. That is a sobering signal for enterprises and individual developers preparing to adopt the Gemini API at scale.

Until Google fixes and publicly explains this backend issue, the community strongly recommends that developers using the Gemini API cache feature closely monitor their Google Cloud billing in near real time and set strict budget limits and alerts, to avoid waking up to an unaffordable bill.
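
Google Cloud budgets send notifications but do not by themselves stop usage, so some teams add their own check on exported hourly costs. The following is a minimal sketch of such a check, not a Google tool: the function name, thresholds, and sample values are all hypothetical, and the input is assumed to be a chronological list of hourly cost totals for one SKU.

```python
from statistics import median


def flag_cost_spikes(hourly_costs, hard_limit, ratio=5.0, baseline_hours=24):
    """Return (hour, cost, reason) tuples for hours that look anomalous.

    hourly_costs: list of (hour_label, cost) pairs in chronological order.
    hard_limit:   absolute hourly cost that always triggers an alert.
    ratio:        alert when cost >= ratio * median of the preceding hours.
    """
    alerts = []
    for i, (hour, cost) in enumerate(hourly_costs):
        if cost >= hard_limit:
            alerts.append((hour, cost, "over hard limit"))
            continue
        # Baseline is the median of up to `baseline_hours` preceding hours.
        history = [c for _, c in hourly_costs[max(0, i - baseline_hours):i]]
        if history:
            baseline = median(history)
            if baseline > 0 and cost >= ratio * baseline:
                alerts.append((hour, cost, "spike vs. baseline"))
    return alerts


# Made-up sample values for illustration only (not the developer's data).
sample = [
    ("hour-1", 22.0),
    ("hour-2", 25.0),
    ("hour-3", 28.0),
    ("hour-4", 180.0),
    ("hour-5", 1040.0),
]
for alert in flag_cost_spikes(sample, hard_limit=500.0):
    print(alert)
```

A relative threshold catches early growth well before an absolute limit is reached, which matters when costs can climb by an order of magnitude within days.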
