Google AI paper that wiped $9 billion off storage stocks is accused of experimental fraud

Original author: Deep Tide TechFlow

A Google paper claiming to "compress AI memory usage to one-sixth" triggered a sell-off last week that erased more than $9 billion of market value from Micron, SanDisk, and other global storage-chip stocks.

However, just days after the promotional push, the researcher whose algorithm the paper supposedly "crushed" (Gao Jianyang, a postdoctoral researcher at ETH Zurich) published a roughly ten-thousand-character open letter. It accuses the Google team of benchmarking his method with a single-core CPU Python script while running their own on an A100 GPU, and of refusing to correct the problems even after being told about them before publication. The post's read count on Zhihu surged past 4 million, Stanford NLP's official account reposted it, and the academic community and the market were shaken at the same time.

The core of the controversy is not complicated: did a top-tier AI conference paper, one that Google officially promoted at scale (directly triggering panicked sell-offs across the global chip sector), systematically misrepresent a previously published prior work and, through deliberately unfair experiments, construct a misleading narrative of performance advantage?

What TurboQuant does: compress AI's "draft paper" to one-sixth of its size

When a large language model generates an answer, it must both write new content and repeatedly look back at content it has already computed. These intermediate results are held in GPU memory; the industry calls this the "KV Cache" (key-value cache). The longer the dialogue, the thicker this "draft paper" becomes, the more GPU memory it consumes, and the higher the cost.
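As a back-of-the-envelope illustration of why this matters, the cache grows linearly with context length. The model dimensions below are illustrative round numbers for a mid-sized model, not tied to any specific product:

```python
def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128,
                   bytes_per_value=2):
    """Rough KV cache size for a single sequence: one key and one
    value vector per token, per head, per layer (fp16 = 2 bytes)."""
    return 2 * n_layers * n_heads * head_dim * seq_len * bytes_per_value

# With these dimensions, the cache costs 0.5 MiB per token:
print(kv_cache_bytes(4_096) / 2**30)    # 2.0 GiB at a 4k-token context
print(kv_cache_bytes(32_768) / 2**30)   # 16.0 GiB at a 32k-token context
```

At long contexts the cache alone can rival the model weights in size, which is why a 6x compression claim moves markets.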

The core selling point of TurboQuant, the algorithm developed by the Google research team, is compressing this draft paper to 1/6 of its original size while claiming zero loss in precision and up to an 8x improvement in inference speed. The paper first appeared on the academic preprint platform arXiv in April 2025, was accepted in January 2026 by ICLR 2026, a top AI conference, and was repackaged and promoted again on March 24 via Google's official blog.

From a technical perspective, TurboQuant's approach can be summarized simply: first, use a mathematical transformation to "wash" messy data into a uniform shape; then compress each item using a precomputed optimal compression table; finally, apply a 1-bit error-correction mechanism to offset the computational deviation the compression introduces. Independent community implementations have verified that the compression largely works as claimed; the mathematical contribution at the algorithm level is real.
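The rotate-then-quantize idea can be sketched in a few lines. This is not TurboQuant's implementation: a plain uniform 4-bit quantizer stands in for the paper's precomputed optimal tables, and the 1-bit error-correction step is omitted entirely:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(dim):
    # QR decomposition of a Gaussian matrix yields a random
    # orthonormal rotation matrix.
    q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    return q

def quantize(x, n_levels=16):
    # Plain uniform 4-bit scalar quantizer; a stand-in for the
    # paper's precomputed tables, not TurboQuant's actual codebook.
    lo, hi = x.min(), x.max()
    step = (hi - lo) / (n_levels - 1)
    codes = np.round((x - lo) / step).astype(np.int8)
    return codes, lo, step

def dequantize(codes, lo, step):
    return codes * step + lo

dim = 64
x = rng.normal(size=dim)                     # one cached vector
R = random_rotation(dim)
rotated = R @ x                              # step 1: smooth the distribution
codes, lo, step = quantize(rotated)          # step 2: compress to small ints
approx = R.T @ dequantize(codes, lo, step)   # invert the rotation
err = np.linalg.norm(x - approx) / np.linalg.norm(x)
print(f"relative reconstruction error: {err:.3f}")
```

Even this crude version keeps the reconstruction error small while storing 4-bit codes instead of full floats; the papers' contributions lie in making that error provably near-optimal.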

The controversy is not about whether TurboQuant can be used, but about what Google did to prove that it is “far beyond the competitors.”

Gao Jianyang's open letter: three allegations, each hitting its mark

At 10 p.m. on March 27, Gao Jianyang published a long-form post on Zhihu and simultaneously submitted formal comments on OpenReview, ICLR's official peer-review platform. Gao is the first author of the RaBitQ algorithm, published in 2024 at SIGMOD, a top database conference; it tackles the same class of problem: efficient compression of high-dimensional vectors.

His allegations come in three parts, and each one has supporting evidence in the form of email records and a timeline.

Allegation One: they used someone else's core method and never explained it in the body of the paper.

TurboQuant and RaBitQ share a key step at their technical core: before compressing the data, both first apply a "random rotation" to it. The purpose of this step is to turn irregularly distributed data into a predictable, uniform distribution, greatly reducing the difficulty of compression. It is the heart of both algorithms and the point where they overlap most closely.
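A toy example shows what the rotation buys (a sketch of the general technique, not RaBitQ's actual code): a vector with all of its energy in one coordinate becomes, after a random rotation, a vector whose coordinates all have comparable, predictable magnitude:

```python
import numpy as np

rng = np.random.default_rng(1)

# A deliberately "irregular" vector: all of its energy in one coordinate.
x = np.zeros(128)
x[0] = 10.0

q, _ = np.linalg.qr(rng.normal(size=(128, 128)))  # random rotation
y = q @ x

# Before: one huge coordinate and 127 zeros. After: the same total
# norm spread across all coordinates, each roughly norm/sqrt(dim)
# in magnitude, which a single shared quantization table handles well.
print(np.abs(x).max(), np.abs(y).max())
print(np.linalg.norm(x), np.linalg.norm(y))  # rotations preserve the norm
```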

TurboQuant's authors acknowledged this in their reviewer rebuttal, but the paper itself never directly explains how the method relates to RaBitQ. More importantly, the background is this: TurboQuant's second author, Majid Daliri, proactively contacted Gao Jianyang's team in January 2025 asking for help debugging his Python port of the RaBitQ source code. The email described the reproduction steps and error messages in detail, which means the TurboQuant team was intimately familiar with RaBitQ's technical details.

An anonymous ICLR reviewer also independently pointed out that the two works use the same technique and requested a thorough discussion. Yet in the final version, the TurboQuant team not only failed to add that discussion; they moved the (already incomplete) description of RaBitQ from the main text into the appendix.

Allegation Two: claiming, without evidence, that the other side's theory is "suboptimal."

The TurboQuant paper flatly labels RaBitQ "theoretically suboptimal," on the grounds that RaBitQ's mathematical analysis is "somewhat rough." But Gao Jianyang points out that the extended version of the RaBitQ paper has already rigorously proven that its compression error reaches the mathematically optimal bound, a result published at a top conference in theoretical computer science.

In May 2025, Gao Jianyang's team explained RaBitQ's theoretical optimality in detail over multiple rounds of email. TurboQuant's second author, Daliri, confirmed that he had informed all the co-authors. Yet the final paper kept the "suboptimal" wording and offered no counterargument.

Allegation Three: an experimental comparison that "ties one fighter's hands and puts a knife in the other's."

This is the most damaging point in the entire letter. Gao Jianyang states that in the speed-comparison experiments, the TurboQuant paper stacked two layers of unfair conditions:

First, RaBitQ's official implementation is optimized C++ code with multi-thread parallelism supported by default, but the TurboQuant team did not use it; instead they tested RaBitQ with their own Python rewrite. Second, they ran RaBitQ on a single CPU core with multi-threading disabled, while running TurboQuant on an NVIDIA A100 GPU.

The combined effect is that readers see the conclusion "RaBitQ is orders of magnitude slower than TurboQuant" with no way of knowing it rests on a setup in which the Google team first bound the opponent's hands and then ran the match. The paper does not adequately disclose these differences in experimental conditions.
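The order-of-magnitude gap such a setup produces is easy to reproduce in miniature. The snippet below has nothing to do with either paper's actual code; it simply times a naive pure-Python loop against the same computation in optimized vectorized form, the kind of mismatch the letter describes:

```python
import time
import numpy as np

x = np.random.default_rng(2).normal(size=200_000).astype(np.float32)

def dot_python(a, b):
    # Naive interpreted loop: analogous to benchmarking a fast
    # algorithm through an unoptimized Python re-implementation.
    s = 0.0
    for i in range(len(a)):
        s += a[i] * b[i]
    return s

t0 = time.perf_counter(); dot_python(x, x); t_py = time.perf_counter() - t0
t0 = time.perf_counter(); float(x @ x);     t_np = time.perf_counter() - t0
print(f"pure Python: {t_py:.4f}s, vectorized: {t_np:.6f}s, "
      f"ratio ~ {t_py / t_np:.0f}x")
```

On typical hardware the interpreted loop loses by a factor of hundreds before any GPU enters the picture, which is why the choice of baseline implementation dominates such comparisons.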

Google’s response: “Random rotation is a general technique; you can’t cite every paper that uses it”

According to what Gao Jianyang disclosed, the TurboQuant team said in an email reply in March 2026: "The use of random rotation and the Johnson-Lindenstrauss transform is standard technology in the field; we can't cite every paper that uses these methods."

Gao Jianyang's team calls this a sleight of hand. The issue is not whether to cite every paper that uses random rotation; it is that RaBitQ was the first work to combine the method with vector compression under exactly the same problem setup and to prove its optimality, so the TurboQuant paper should describe the relationship between the two accurately.

Stanford NLP's official X account reposted Gao Jianyang's statement. His team has published an open comment on the ICLR OpenReview platform, submitted a formal complaint to the ICLR conference chairs and ethics committee, and plans to publish a detailed technical report on arXiv.

Independent tech blogger Dario Salvati offered a relatively neutral assessment in his analysis: TurboQuant does have real contributions in its mathematical methods, but its relationship with RaBitQ is much more tightly connected than the paper’s wording suggests.

$9 billion in market value evaporated: a paper controversy compounded by market panic

The timing of the academic controversy is extremely delicate. After Google promoted TurboQuant via its official blog on March 24, the global storage-chip sector suffered a severe sell-off. According to multiple outlets including CNBC, Micron fell for six consecutive trading days, down more than 20% cumulatively; SanDisk dropped 11% in a single day; Korea's SK Hynix fell about 6%, Samsung Electronics nearly 5%, and Japan's Kioxia about 6%. The market's panic logic was blunt: if software compression can cut AI inference memory requirements sixfold, demand for storage chips faces a structural downgrade.

Morgan Stanley analyst Joseph Moore pushed back on this logic in a March 26 research note, maintaining an "overweight" rating on Micron and SanDisk. Moore noted that what TurboQuant compresses is only one specific type of cache, the KV Cache, not overall memory usage, and characterized it as a "normal productivity improvement." Wells Fargo analyst Andrew Rocha invoked the Jevons paradox, arguing that efficiency gains that lower costs may instead stimulate larger-scale AI deployments and ultimately lift memory demand.

Old paper, new packaging: the transmission-chain risk from AI research to market narratives

According to tech blogger Ben Pouladian's analysis, the TurboQuant paper had been publicly available since April 2025 and is not new research. On March 24, Google repackaged and promoted it via its official blog, and the market priced it as a brand-new breakthrough. This "old paper, new release" promotional strategy, combined with possible experimental bias in the paper itself, exposes a systemic risk in how AI research travels from academic papers to market narratives.

For AI infrastructure investors, when a paper claims performance improvements of several orders of magnitude, the first question to ask is whether the benchmark comparison conditions are fair.

Gao Jianyang’s team has clearly stated that it will continue pushing for an official resolution to the issues. Google has not yet issued an official response to the specific allegations in the open letter.
