Is the storage sector’s pullback an unjustified sell-off or a breakdown of the investment thesis?

This week, a new technology from Google rattled the storage sector. Micron Technology, for example, fell for five consecutive days, and related A-share stocks also pulled back.

TurboQuant claims to cut the KV Cache storage needed for long contexts by a factor of six and to speed up inference by a factor of eight. After its release, storage stocks plunged across the board, with many shouting that “AI storage demand is about to disappear.”

But think about it carefully: doesn’t this sound familiar? When GQA came out in 2023, some said KV Cache demand would be halved and storage would cool down; when PagedAttention came out in 2024, the same argument emerged again.

So what happened? Over the past two years, the global token consumption of large models has increased by more than ten times, and the demand for storage has actually surged.

01 The Truth About TurboQuant: Compression Is Not Necessarily Bad News

Many people hear “6x compression” and think: doesn’t that mean storage demand falls to one-sixth? Isn’t that a death sentence for storage manufacturers?

If you think this way, you’ve misunderstood the logic of this technology.

Simply put, KV Cache is where a large model keeps the attention keys and values of the tokens it has already processed during inference; without it, the model would have to reprocess the entire preceding conversation for every new token. This takes up most of the memory demand during the inference phase. The compression done by TurboQuant targets the core bottleneck of AI inference: the memory wall.

Context windows for large models have grown from 4K tokens to 128K, or even millions, and the number of concurrent inference requests is also rising. Without compression, even piling up all available HBM would not be enough. Moreover, moving data costs far more than computing on it, which slows inference down.

So compression is meant to make long-context, high-concurrency inference feasible, not to reduce storage usage.
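To make the scale concrete, here is a rough sketch of how KV Cache size grows with context length. The model shape (32 layers, 32 KV heads, head dimension 128, 16-bit values) is a hypothetical configuration chosen for round numbers, not a figure from any specific model, and the 6x ratio is simply TurboQuant’s claimed figure applied as a flat divisor.

```python
GIB = 1024 ** 3


def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, batch_size, bytes_per_value):
    """Approximate KV cache size in bytes.

    Each token stores one key vector and one value vector (hence the factor 2)
    per layer and per KV head.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_value


# Hypothetical model shape, for illustration only.
LAYERS, KV_HEADS, HEAD_DIM = 32, 32, 128
BYTES_FP16 = 2

short_ctx = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=4 * 1024,
                           batch_size=1, bytes_per_value=BYTES_FP16)
long_ctx = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=128 * 1024,
                          batch_size=1, bytes_per_value=BYTES_FP16)

# Assumption: the claimed 6x compression applies uniformly to the whole cache.
long_ctx_compressed = long_ctx / 6

print(f"4K context:               {short_ctx / GIB:.1f} GiB")
print(f"128K context:             {long_ctx / GIB:.1f} GiB")
print(f"128K context, 6x smaller: {long_ctx_compressed / GIB:.1f} GiB")
```

Under these assumptions the cache grows linearly with context length, from 2 GiB at 4K tokens to 64 GiB at 128K tokens for a single request. A 6x reduction brings the latter to roughly 10.7 GiB, and every additional concurrent request multiplies these figures.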

Furthermore, this is not a new phenomenon; the industry has been iterating on it for years.

GQA in 2023 compressed KV Cache by a factor of 4 to 8; in 2024, quantization and PagedAttention compressed it by a further factor of 2 to 4. Each time, people claimed that storage demand would disappear, but what was the result?

After each round of compression, everyone felt free to extend context lengths and serve more concurrent requests. Long-text inference that was previously unaffordable became usable, and the new demand filled the space saved by compression and then some.

This is the Jevons Paradox from economics, and video compression is the most visible example: when H.264 and H.265 came out, they cut the storage needed per unit of video by more than half, and as a result everyone started producing 4K and 8K high-definition video. Today a 10-minute video easily exceeds 10GB, and total demand for video storage has increased by dozens of times.

The same reasoning applies to TurboQuant. A 6x compression may seem significant, but look at current demand growth: in February 2026, global token consumption of large models was ten times that of the same period a year earlier. By 2028, the global data volume is expected to rise to 394ZB, more than five times the 2020 level. Against exponential demand growth, this level of compression is merely a drop in the bucket.
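The arithmetic behind this argument can be sketched in a few lines. This is a back-of-the-envelope model, not a forecast: it assumes KV Cache memory scales linearly with token consumption and that the claimed 6x compression is applied to every workload, neither of which is established here. The function name is illustrative.

```python
def net_demand_multiplier(usage_growth: float, compression_ratio: float) -> float:
    """Change in memory demand when usage grows and each unit of usage shrinks.

    Assumes memory demand is proportional to usage divided by the compression ratio.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return usage_growth / compression_ratio


TOKEN_GROWTH = 10.0      # year-on-year token consumption growth cited above
TURBOQUANT_RATIO = 6.0   # claimed KV Cache compression

net = net_demand_multiplier(TOKEN_GROWTH, TURBOQUANT_RATIO)
print(f"Net memory demand multiplier: {net:.2f}x")

# Break-even: usage only has to grow by the compression ratio to cancel the saving.
break_even_growth = TURBOQUANT_RATIO
print(f"Usage growth needed to offset compression: {break_even_growth:.0f}x")
```

Under these simplified assumptions, one year of tenfold token growth more than offsets a one-time 6x compression, leaving demand about 1.67 times higher than before.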

More importantly, the cost reduction brought by compression will unleash more new demand.

Previously, long context inference was too expensive for many enterprises. Now that costs have dropped, they are willing to use it, and cloud vendors can also relax the context and concurrency limitations. In the end, the total storage demand will likely be further amplified.

In short, TurboQuant is an optimization on the supply side, not a disappearance of demand. It is an optimization made to alleviate the memory wall in the context of insufficient HBM supply.

In the short term, the gap between HBM supply and demand remains, and this gap may even widen due to the release of new demand.

02 Long Cycle Prosperity Meets Geopolitical Black Swan

In fact, even before the volatility triggered by TurboQuant, the storage industry had already entered a supercycle, with the supply-demand balance at its tightest.

On the demand side, the explosion of AI has already pulled storage demand to unprecedented heights.

In the past, storage demand relied on PCs and smartphones; now, AI servers and multimodal applications have become the new engines.

ByteDance’s Seedance 2.0 consumes tens of times more tokens for a 10-minute video than text does, and Nvidia’s new architecture has pushed NAND demand from the TB level to the PB level, with single-rack capacity increasing fivefold.

Global internet giants are frantically building out computing infrastructure. In 2026, capital expenditure from the eight major CSPs is expected to rise by 25% to $500 billion, most of which will go to AI infrastructure, with storage among the most essential needs.

On the supply side, the three major overseas storage giants, Samsung, SK Hynix, and Micron, have already tightly constrained their production capacity.

Having experienced losses in the previous cycle, their current capacity expansions are very cautious, and all new capacity is focused on high-margin high-end products like HBM and DDR5, while production capacity for low-end DRAM and NAND is actually shrinking.

What’s worse is that high-end HBM production capacity cannot be expanded in the short term.

Building a clean room takes 8-12 months, and ramping up yield takes even longer.

Currently, the inventory of the three major manufacturers has reached a historical low of 3-5 weeks, meaning that after they sell out their current stock, the next batch has not yet been produced. The supply has become extremely rigid.

This tight balance between supply and demand has already caused storage prices to rise for several months.

Micron’s latest financial report is the best proof: in FY26Q2, its revenue skyrocketed to $23.86 billion, a year-on-year increase of 196%, with net profit reaching $14.021 billion, a year-on-year increase of 686%, and an operating gross margin of 69%. This is the power of a super cycle.

And just at this moment, the conflict around the Strait of Hormuz has put further pressure on an already extremely tight supply-demand balance.

You have to understand that most of the world’s storage capacity is in South Korea, with Samsung and SK Hynix accounting for 70% of global DRAM capacity. Moreover, South Korea relies on the Middle East for 70% of its oil imports, which mostly comes through the Strait of Hormuz.

What’s more critical is that essential rare gases for storage production, such as helium, are 64.7% dependent on Qatar, and Qatar’s helium production has already stopped, cutting global supply by 30%. Additionally, the majority of neon gas comes from Iran, and now these resources have become ticking time bombs for the supply chain.

This is the current state of the storage industry: the long-term thesis is the demand supercycle brought by AI, with rigid supply and continuously rising prices; the short-term factor is that the geopolitical conflict around Hormuz directly threatens overseas storage capacity, making an already tight supply even tighter.

03 Who Can Catch This Wave of Global Supply Gap?

Many people may ask, where are the opportunities for domestic storage at this time? Which segments are most worthy of our attention?

It’s actually quite simple. Focus on two things: first, identify the segments where overseas supply is most exposed to geopolitical disruption; second, find the domestic leaders that have already made technological breakthroughs and have production capacity ready. Only they can absorb a sudden surge in demand.

First, consider high-end HBM and DRAM-related segments.

Think about it: Samsung and SK Hynix’s capacities have already been locked in by Nvidia. If their production capacity encounters issues due to energy or raw material problems, who will fill that gap?

Of course, it will be the upstream domestic storage chip foundries, which are currently rapidly expanding capacity and have improved yield. They are also pushing for HBM development, and if overseas supply encounters issues, customers may proactively accelerate the verification of domestic products.

Next is the midstream storage module segment.

For example, companies like Jiangbolong and Baiwei Storage already have mature customer channels. If there are issues with overseas wafer supply and prices rise, they can rely on domestic wafer capacity to provide customers with more stable and cheaper storage products.

In the past, everyone thought that overseas supply chains were very stable and was reluctant to take the risk of switching to domestic products. However, the geopolitical conflict has sounded the alarm for everyone: overseas supply chains can break too, and concentrating all production capacity overseas is a major risk.

Moreover, rising oil prices have caused overseas companies’ costs to soar, making domestic companies’ cost-performance advantages even more apparent.

In conclusion, the long-term logic for storage is the super demand cycle brought by AI and a decade-long domestic substitution journey; the short-term catalyst is this geopolitical conflict, which has accelerated the entire process.

However, the sector has already risen significantly and may fully reflect the market’s optimistic expectations, so investors should remain alert to several risks:

Risk of AI development not meeting expectations: Current AI-driven storage demand continues to rise; if advancements in large model technology do not meet expectations, there is a risk of downward adjustments in AI Capex, which would affect demand.

Risk of storage price decline: Due to skyrocketing storage prices, there are speculative hoarding phenomena in circulation channels. If excessive speculation affects downstream demand, there is a risk of price drops.

Risk of R&D progress not meeting expectations: Storage companies need to continuously upgrade and innovate products. If strategic choices fail, there is a risk of research and development failures.

04 Conclusion

Looking back at the history of the global storage industry, every geopolitical conflict accelerates the restructuring of supply chains; every technological revolution gives rise to entirely new storage demands.

Right now, we are standing at the intersection of the AI revolution and supply chain restructuring, and storage is precisely the core track pointed to by these two major waves.

Of course, investment in any track cannot be smooth sailing. The storage industry still faces risks such as changes in overseas trade policies, intensifying industry competition, and technological iterations that may not meet expectations. This requires us to continuously track industry changes, discern the real from the false, and identify truly competitive enterprises.
