Google's AI paper that wiped $90 billion off storage stocks is accused of experimental fraud


Author: Shenchao TechFlow

A Google paper claiming to “compress AI memory usage to one-sixth” wiped more than $90 billion of market value off global storage chip stocks such as Micron and SanDisk last week.

However, just two days after the paper’s release, the author of the comparison method it “crushed” fought back. Gao Jianyang, a postdoctoral researcher at ETH Zurich, published an open letter of more than ten thousand words, accusing the Google team of benchmarking his method with a single-core CPU Python script while running their own on an A100 GPU, and of refusing to correct the record even after being notified before submission. The letter quickly passed 4 million views on Zhihu, Stanford NLP’s official account reposted it, and the shockwave hit academia and the market simultaneously.

(Reference reading: A paper that knocked down storage stocks)

The core issue of this controversy is not complicated: did an AI conference paper, heavily promoted by Google and directly triggering panic selling in the global chip sector, systematically distort a previously published work and create a false narrative of performance advantage through deliberately unfair experiments?

What TurboQuant Did: Compressing AI’s “Scratch Paper” to One-Sixth

When generating responses, large language models need to look back at previously calculated content while writing. These intermediate results are temporarily stored in memory, known in the industry as “KV Cache” (key-value cache). The longer the conversation, the thicker this “scratch paper” becomes, resulting in greater memory consumption and higher costs.
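To make the scale concrete, here is a back-of-the-envelope estimate of KV cache size. The model configuration below (layer count, head count, head dimension) is a hypothetical illustration, not any specific model’s real numbers:

```python
# Rough KV cache size estimate for a hypothetical transformer.
# All configuration numbers below are illustrative assumptions.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # Each layer stores a key AND a value tensor per token: factor of 2.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Example: 32 layers, 8 KV heads of dim 128, a 32k-token conversation, fp16.
full = kv_cache_bytes(32, 8, 128, 32_000)
print(f"fp16 cache:        {full / 2**30:.2f} GiB")      # ~3.9 GiB
print(f"compressed to 1/6: {full / 6 / 2**30:.2f} GiB")
```

The linear dependence on `seq_len` is the whole story: double the conversation length and the “scratch paper” doubles too, which is why a 6x compression claim moved markets.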

The core selling point of TurboQuant, the algorithm developed by Google’s research team, is that it compresses this scratch paper to one-sixth of its original size, while also claiming zero loss in accuracy and up to 8x faster inference. The paper first appeared on the academic preprint platform arXiv in April 2025, was accepted by the top AI conference ICLR 2026 in January 2026, and was repackaged and promoted on Google’s official blog on March 24.

From a technical perspective, TurboQuant’s approach can be simply understood as: first using a mathematical transformation to “clean” messy data into a uniform format, then compressing it one by one using a pre-calculated optimal compression table, and finally using a 1-bit error correction mechanism to correct the computational deviations caused by compression. Independent implementations in the community have verified that its compression effect is essentially valid, and the mathematical contributions at the algorithmic level are indeed real.
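As a rough illustration of that three-step pipeline (random rotation, codebook quantization, residual correction), here is a minimal sketch. The codebook, step size, and correction rule are simplified assumptions chosen for demonstration; this is not TurboQuant’s actual construction:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64

# Step 1: "clean" the data with a random rotation. QR decomposition of
# a Gaussian matrix yields a random orthogonal matrix Q.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))

# Step 2: compress coordinate by coordinate against a precomputed
# codebook (here a simple 4-bit uniform grid, purely illustrative).
codebook = np.linspace(-2.0, 2.0, 16)

def quantize(v):
    idx = np.abs(v[:, None] - codebook[None, :]).argmin(axis=1)
    return codebook[idx]

# Step 3: a 1-bit-per-coordinate correction: keep only the SIGN of the
# residual and nudge each coordinate a quarter step in that direction.
step = codebook[1] - codebook[0]

def correct(v, vq):
    return vq + np.sign(v - vq) * (step / 4)

x = rng.standard_normal(d)
xr = Q @ x             # rotated input (rotation preserves the norm)
xq = quantize(xr)      # coarse reconstruction
xc = correct(xr, xq)   # reconstruction after the 1-bit correction

print(np.linalg.norm(xr - xq), np.linalg.norm(xr - xc))
```

In expectation, this sign-only correction shrinks the quantization error (a residual uniform on ±step/2 has its mean squared error quartered), which mirrors the role the paper assigns to its 1-bit mechanism.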

The controversy is not about whether TurboQuant can be used, but rather what Google did to prove that it “far surpasses competitors.”

Gao Jianyang’s Open Letter: Three Accusations, Each Hitting the Mark

On the evening of March 27, Gao Jianyang published a long article on Zhihu and simultaneously submitted formal comments on the ICLR official review platform OpenReview. Gao Jianyang is the first author of the RaBitQ algorithm, which was published at the top database conference SIGMOD in 2024 and addresses the same type of problem—efficient compression of high-dimensional vectors.

His accusations are threefold, each supported by email records and timelines.

Accusation One: Used Others’ Core Methods Without Mentioning Them.

TurboQuant and RaBitQ share a key step at their technical core: before compressing the data, both first apply a “random rotation.” This operation transforms irregularly distributed data into a predictable, uniform-like distribution, significantly reducing the difficulty of compression. It is the step where the two algorithms are closest, and it sits at the heart of both.
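The effect of that rotation can be seen in a few lines: a vector with all of its energy in one coordinate (a worst case for per-coordinate compression) becomes, after a random rotation, a vector whose energy is spread almost evenly. This is a generic demonstration of the shared technique, not code from either paper:

```python
import numpy as np

rng = np.random.default_rng(42)
d = 1024

# A random orthogonal matrix: Q factor of a Gaussian matrix's QR.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))

# A "messy" vector: all energy concentrated in one coordinate.
x = np.zeros(d)
x[0] = 1.0

y = Q @ x  # after rotation

# The norm is unchanged, but no single coordinate dominates anymore,
# so a simple per-coordinate quantizer handles y far better than x.
print(np.linalg.norm(y))   # still 1.0
print(np.abs(y).max())     # small: energy is spread across 1024 coords
```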

The authors of TurboQuant themselves acknowledged this point in their review responses, yet never explicitly stated the connection of this method to RaBitQ in the full text of the paper. The more critical background is: TurboQuant’s second author Majid Daliri proactively contacted Gao Jianyang’s team in January 2025, requesting help debugging his Python version rewritten from RaBitQ’s source code. The email detailed the reproduction steps and error messages—in other words, the TurboQuant team was well aware of the technical details of RaBitQ.

An anonymous reviewer for ICLR also independently pointed out that both used the same technique and requested sufficient discussion. However, in the final version of the paper, the TurboQuant team not only failed to add a discussion but also moved the already incomplete description of RaBitQ from the main text to the appendix.

Accusation Two: Baselessly Claimed the Competitor’s Theory Was “Suboptimal.”

The TurboQuant paper flatly labeled RaBitQ “theoretically suboptimal,” claiming its mathematical analysis was “somewhat rough.” Gao Jianyang countered that the extended version of the RaBitQ paper rigorously proves its compression error attains the mathematically optimal bound—a result presented at a top conference in theoretical computer science.

In May 2025, Gao Jianyang’s team detailed the optimality of RaBitQ’s theory through several rounds of emails. Daliri confirmed that all authors were informed. Yet the paper ultimately retained the “suboptimal” phrasing without providing any counterarguments.

Accusation Three: A Benchmark Where One Side Fought with Hands Tied and the Other Held a Knife.

This is the most damaging accusation in the letter. Gao Jianyang pointed out that the TurboQuant paper stacked two layers of unfair conditions in its speed comparison experiments:

First, although RaBitQ’s authors provide officially optimized C++ code with multithreading enabled by default, the TurboQuant team did not use it; they benchmarked RaBitQ with their own Python translation instead. Second, RaBitQ ran on a single CPU core with multithreading disabled, while TurboQuant ran on an NVIDIA A100 GPU.

The combined effect of these two conditions is: the conclusion readers see is “RaBitQ is several orders of magnitude slower than TurboQuant,” but they have no idea that this conclusion is predicated on the Google team tying their competitor’s hands and feet before the race. The paper did not fully disclose the differences in these experimental conditions.

Google’s Response: “Random Rotation is a Common Technique, We Can’t Cite Every Paper.”

According to Gao Jianyang, the TurboQuant team stated in their email response in March 2026: “The use of random rotation and Johnson-Lindenstrauss transformations has become standard technology in the field; we cannot cite every paper that has used these methods.”

Gao Jianyang’s team calls this a sleight of hand: the question was never whether to cite every paper that has used random rotation. The point is that RaBitQ was the first to combine the technique with vector compression in exactly the same problem setting and to prove its optimality, so the TurboQuant paper should describe the relationship between the two accurately.

The official X account of the Stanford NLP Group retweeted Gao Jianyang’s statement. Gao Jianyang’s team has published a public comment on the ICLR OpenReview platform and submitted a formal complaint to the chair and ethics committee of the ICLR conference, with plans to release a detailed technical report on arXiv later.

Independent tech blogger Dario Salvati provided a relatively neutral assessment in his analysis: TurboQuant does indeed have real contributions in mathematical methods, but its relationship to RaBitQ is much closer than the paper suggests.

$90 Billion Wiped Out: A Paper Controversy Meets Market Panic

The timing of this academic controversy could hardly be more delicate. After Google’s official blog promoted TurboQuant on March 24, the global storage chip sector sold off hard. According to CNBC and other media, Micron Technology fell for six consecutive trading days, down more than 20% cumulatively; SanDisk dropped 11% in a single day; South Korea’s SK Hynix fell about 6% and Samsung Electronics nearly 5%, while Japan’s Kioxia lost about 6%. The market’s panic logic was simple and brutal: if software compression can cut AI inference memory requirements to one-sixth, the demand outlook for storage chips faces a structural reset.

Morgan Stanley analyst Joseph Moore countered this logic in a research report on March 26, maintaining “overweight” ratings for Micron and SanDisk. Moore pointed out that TurboQuant only compresses a specific type of cache, the KV Cache, and does not affect overall memory usage, categorizing it as “normal productivity improvement.” Wells Fargo analyst Andrew Rocha also cited Jevons Paradox, suggesting that efficiency improvements leading to reduced costs might instead stimulate larger-scale AI deployments, ultimately increasing memory demand.

Old Paper, New Packaging: Risks in the Transmission Chain from AI Research to Market Narrative

According to tech blogger Ben Pouladian’s analysis, the TurboQuant paper was publicly released in April 2025 and is not new research. However, when Google repackaged and promoted it through its official blog on March 24, the market treated it as a new breakthrough for pricing. This “old paper, new release” promotional strategy, combined with possible experimental biases in the paper, reflects the systemic risks in the transmission chain from academic papers to market narratives in AI research.

For investors in AI infrastructure, when a paper claims to achieve “several orders of magnitude” performance improvement, the first question to ask is whether the benchmark comparison conditions are fair.

Gao Jianyang’s team has made it clear that they will continue to push for a formal resolution to the issues. Google has yet to formally respond to the specific accusations in the open letter.
