Google's AI paper that wiped $90 billion off storage stocks is accused of experimental fraud

Author: Deep Tide TechFlow

A Google paper claiming to "compress AI memory usage to one-sixth" wiped more than $90 billion in market value off global storage-chip stocks such as Micron and SanDisk last week.

However, just two days after the paper's release, the baseline that the algorithm had supposedly crushed struck back: postdoctoral researcher Gao Jianyang of ETH Zurich published a lengthy open letter accusing the Google team of benchmarking the competing method with a single-core CPU Python script while running their own method on an A100 GPU, and of refusing to make corrections even after being informed of the issues before submission. The post quickly surpassed 4 million views on Zhihu and was retweeted by the official Stanford NLP account, shaking both academia and the market.

(Reference Reading: A paper that knocked down storage stocks)

The core issue of this controversy is not complicated: did a paper officially promoted by Google, one that directly triggered panic selling across the global chip sector, systematically misrepresent previously published prior work and manufacture a false narrative of performance advantage through deliberately unfair experiments?

What TurboQuant did: compressing AI’s “scratch paper” to one-sixth its original size

When generating responses, large language models need to write while referring back to previously calculated content. These intermediate results are temporarily stored in VRAM, known in the industry as “KV Cache.” The longer the conversation, the thicker this “scratch paper” becomes, leading to greater VRAM consumption and higher costs.
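To make the "thicker scratch paper" concrete, here is a back-of-the-envelope sketch of how KV Cache size scales with context length. The model parameters below are illustrative stand-ins for a typical Llama-style model and are not taken from the paper:

```python
# Hypothetical sketch: KV Cache grows linearly with context length.
# Parameters are illustrative, not from the TurboQuant paper.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value):
    # Each token stores one key vector and one value vector
    # per layer per KV head, hence the factor of 2.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value

# FP16 (2 bytes per value) cache for a 32-layer model with
# 8 KV heads of dimension 128, at a 32k-token context:
full = kv_cache_bytes(32, 8, 128, 32_768, 2)
# The claimed 6x compression would shrink the same cache accordingly:
compressed = full / 6
print(f"{full / 2**30:.2f} GiB -> {compressed / 2**30:.2f} GiB")
```

Even at these modest settings the cache runs into gigabytes per request, which is why a 6x compression claim moves markets.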

The TurboQuant algorithm developed by the Google research team has one core selling point: compressing this scratch paper to one-sixth its original size, with claimed zero loss in accuracy and up to 8x faster inference. The paper first appeared on the academic preprint platform arXiv in April 2025, was accepted at the top AI conference ICLR 2026 in January 2026, and was repackaged and promoted on Google's official blog on March 24.

On a technical level, TurboQuant's approach can be understood in three steps: first apply a mathematical transformation to "clean" the messy data into a uniform format, then compress each value using a pre-calculated optimal quantization table, and finally apply a 1-bit error-correction mechanism to offset the computational biases the compression introduces. Independent community implementations have verified that the compression effect is essentially real, and the algorithm's mathematical contributions are genuine.
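The first two steps can be sketched in a purely illustrative way. The snippet below is not the paper's algorithm: the "compression table" is replaced by plain uniform scalar quantization, and the 1-bit error-correction step is omitted entirely. It only shows the rotate-then-quantize-then-unrotate shape of the pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(dim):
    # A random orthonormal matrix via QR decomposition; rotating by it
    # spreads a vector's mass evenly across coordinates.
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def quantize(v, n_bits):
    # Uniform scalar quantization with a per-vector scale, a simple
    # stand-in for the paper's precomputed optimal quantization table.
    levels = 2 ** n_bits - 1
    scale = np.abs(v).max() / (levels / 2)
    codes = np.round(v / scale)
    return codes, scale

dim = 64
R = random_rotation(dim)
x = rng.standard_normal(dim) * np.linspace(0.1, 5.0, dim)  # "messy" data

rotated = R @ x                          # step 1: clean the distribution
codes, scale = quantize(rotated, n_bits=3)  # step 2: compress to 3 bits
x_hat = R.T @ (codes * scale)            # decompress: dequantize, unrotate

err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
print(f"relative reconstruction error: {err:.3f}")
```

Storing 3-bit codes instead of 16-bit floats is where the memory saving comes from; the real algorithm's contribution lies in making that reconstruction error provably small.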

The controversy is not about whether TurboQuant can be used, but rather what Google has done to prove it “far exceeds competitors.”

Gao Jianyang’s open letter: three accusations, each hitting the nail on the head

On March 27 at 10 PM, Gao Jianyang published a lengthy article on Zhihu and submitted a formal comment on the ICLR official review platform OpenReview. Gao Jianyang is the first author of the RaBitQ algorithm, which was presented at the top conference SIGMOD in the database field in 2024, addressing the same type of problem—efficient compression of high-dimensional vectors.

His accusations are threefold, each supported by email records and timelines.

Accusation one: Used someone else’s core method without mentioning it.

TurboQuant and RaBitQ share a key common step in their core technology: before compressing data, they perform a “random rotation” on the data. This operation serves to transform originally irregularly distributed data into a predictable uniform distribution, significantly reducing the complexity of compression. This is the most critical and closest part of the two algorithms.

The authors of TurboQuant themselves acknowledged this point in their review responses, yet the paper never directly stated how the method relates to RaBitQ. More crucially, TurboQuant's second author Majid Daliri proactively contacted Gao Jianyang's team in January 2025, requesting help debugging a Python version he had rewritten from RaBitQ's source code. The email detailed reproduction steps and error messages; in other words, the TurboQuant team was well aware of RaBitQ's technical details.

An anonymous reviewer from ICLR also independently pointed out that both used the same technique and requested a thorough discussion. However, in the final version of the paper, the TurboQuant team not only failed to add a discussion but also moved the already incomplete description of RaBitQ from the main text to the appendix.

Accusation two: Labeling the rival theory "suboptimal" without evidence.

The TurboQuant paper flatly labeled RaBitQ "theoretically suboptimal," on the grounds that RaBitQ's mathematical analysis was "rather rough." Gao Jianyang countered that the extended version of the RaBitQ paper rigorously proves its compression error attains the mathematically optimal bound, a result published at a top conference in theoretical computer science.

In May 2025, Gao Jianyang’s team explained in detail the optimality of RaBitQ’s theory through multiple rounds of emails. TurboQuant’s second author Daliri confirmed that all authors had been informed. Yet the paper ultimately retained the “suboptimal” statement without providing any counterarguments.

Accusation three: Experiment comparisons where “one hand is tied, the other holds a knife.”

This is the most damaging point in the entire text. Gao Jianyang pointed out that the TurboQuant paper imposed two layers of unfair conditions in the speed comparison experiments:

First, RaBitQ's authors provide officially optimized C++ code that supports multi-threading by default, but the TurboQuant team did not use it, instead testing RaBitQ with their own Python translation. Second, RaBitQ was run on a single CPU core with multi-threading disabled, while TurboQuant ran on an NVIDIA A100 GPU.

The effect of these two combined conditions is that the conclusion readers see is “RaBitQ is several orders of magnitude slower than TurboQuant,” without knowing that the premise of this conclusion is that the Google team tied the opponent’s hands and feet before the race. The paper did not sufficiently disclose the differences in these experimental conditions.

Google’s response: “Random rotation is a universal technique; we can’t cite every paper that uses it.”

According to Gao Jianyang, the TurboQuant team stated in a March 2026 email response: “The use of random rotation and Johnson-Lindenstrauss transformations has become standard technology in the field, and we cannot cite every paper that has used these methods.”

Gao Jianyang's team regards this as a sleight of hand: the issue is not whether to cite every paper that uses random rotation, but that RaBitQ was the first to combine this method with vector compression under the exact same problem setting and prove its optimality; the TurboQuant paper should therefore accurately describe the relationship between the two.

The official Stanford NLP Group X account retweeted Gao Jianyang’s statement. Gao Jianyang’s team has publicly commented on the ICLR OpenReview platform and submitted a formal complaint to the ICLR conference chair and the ethics committee, with plans to publish a detailed technical report on arXiv in the future.

Independent tech blogger Dario Salvati provided a relatively neutral assessment in his analysis: TurboQuant does have real contributions in mathematical methods, but the relationship with RaBitQ is much closer than presented in the paper.

$90 billion in market value evaporated: paper controversy compounded by market panic

The timing of this academic dispute is quite delicate. Following Google's official blog post on TurboQuant on March 24, the global storage-chip sector faced intense selling. According to reports from multiple media outlets including CNBC, Micron Technology dropped for six consecutive trading days, with a cumulative decline of over 20%; SanDisk saw a single-day drop of 11%; South Korea's SK Hynix fell about 6%, Samsung Electronics declined nearly 5%, and Japan's Kioxia dropped about 6%. The market's panic logic is simple and brutal: if software compression can cut AI inference memory requirements to one-sixth, the demand outlook for storage chips gets structurally revised downward.

Morgan Stanley analyst Joseph Moore rebutted this logic in a research report on March 26, maintaining "overweight" ratings on Micron and SanDisk. Moore pointed out that TurboQuant compresses only one specific type of cache, the KV Cache, which does not represent overall memory usage, and characterized it as a "normal productivity improvement." Wells Fargo analyst Andrew Rocha similarly invoked the Jevons paradox, arguing that efficiency improvements that lower costs could instead stimulate larger-scale AI deployments, ultimately boosting memory demand.

Old paper, new packaging: risks in the transmission chain from AI research to market narratives

According to tech blogger Ben Pouladian’s analysis, the TurboQuant paper was publicly released in April 2025 and is not new research. On March 24, Google repackaged and promoted it through their official blog, but the market treated it as a brand new breakthrough for pricing. This “old paper, new release” promotional strategy, combined with possible experimental biases in the paper, reflects a systemic risk in the transmission chain of AI research from academic papers to market narratives.

For investors in AI infrastructure, when a paper claims to achieve “several orders of magnitude” performance improvement, the first question to ask is whether the baseline comparison conditions are fair.

Gao Jianyang’s team has clearly stated they will continue to push for a formal resolution of the issues. Google has yet to make a formal response to the specific accusations in the open letter.
