Storage takes a blow to the head, but AI hasn't loosened its grip on the market.
(Source: China Fund News)
“At this stage, locking in production capacity is more important than talking about prices.”
Author: Niu Siruo
A Google technical update claiming "memory usage reduced to 1/6 of the original" sent a cold sweat through the global storage sector.
Recently, Google rolled out the TurboQuant compression algorithm, which—without sacrificing model accuracy—reduces the space requirement for the “key-value cache” (KV Cache), the most resource-intensive part during the AI inference stage, to 1/6 of the original, while boosting attention computation speed by 8x.
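The article does not describe TurboQuant's internals, but the general idea behind KV-cache compression is low-bit quantization of the cached key/value tensors. The sketch below is a generic, hypothetical illustration (not Google's actual algorithm): a symmetric quantizer that maps each row of floats to 4-bit integer levels plus one scale factor, trading a small reconstruction error for a large memory saving.

```python
def quantize(row, bits=4):
    """Symmetric quantization: map a row of KV-cache floats to signed ints."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for 4-bit levels
    scale = max(abs(v) for v in row) / qmax or 1.0  # one scale per row
    return [max(-qmax, min(qmax, round(v / scale))) for v in row], scale

def dequantize(q, scale):
    """Recover approximate floats from integer levels and the row scale."""
    return [v * scale for v in q]

row = [0.7, -1.4, 0.05, 2.1]        # toy slice of a cached K tensor
q, s = quantize(row)
approx = dequantize(q, s)           # close to `row`, within about scale/2 per value
```

Storing 4-bit levels instead of 16-bit floats cuts the cache roughly 4x before metadata; reaching the reported 1/6 would require more aggressive bit widths or entropy coding, which this toy sketch does not attempt.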
Once the news broke, the market quickly read it as a threat to overall demand for AI storage chips. The reaction spread rapidly to capital markets, sending storage-chip concept stocks lower across the board.
Meanwhile, the topic of “a cliff-like drop in memory stick prices” surged to the top of the hot search list. With channel prices loosening, technical disruptions, and a pullback in the sector all intertwined, the market couldn’t help but start asking again: has this storage cycle already reached a turning point?
Local “loosening” from the channel side
This round of widely discussed memory-stick price declines has occurred mostly in spot channels serving individual buyers with DIY-build demand.
That market is limited in scale, making it more sensitive to price swings and consumer sentiment. Channel distributors must watch upstream quotes while gauging whether end consumers will actually buy.
A shop owner in Huaqiangbei who specializes in memory sticks told the reporter that starting last Wednesday, multiple memory products began to be discounted. The price of 16GB memory has fallen from around 900 yuan last week to around 700 yuan, and 32GB memory has likewise dropped by about 300 yuan.
Plunging channel prices and sliding share prices can indeed create the illusion that "the market has topped." In the view of industry insiders, however, this looks more like a brief correction in the channel market after overly rapid gains, not a reversal of the industry trend.
"Because the earlier price increases were so large, channel customers' resistance to high-priced storage products has grown by the day, and actual transactions have become difficult. More importantly, on the spot side, traders looking to cash out and lock in profits have dumped large volumes of low-end DDR4 memory sticks, putting further downward pressure on the channel market," a market analyst said.
The contract market, however, paints a different picture. According to the same analyst, in the first quarter of this year, contract prices for original-manufacturer server and PC NAND and DRAM all roughly doubled.
The person said that storage products currently can’t fully meet market demand. The shortage of storage supply is unlikely to be relieved in the short term, so the channel market’s price pullback will not rewrite the overall upward logic of the storage industry.
“No memory, no AI”
Almost at the same time, the mood at the MemoryS 2026 conference in Shenzhen was entirely different.
"Everyone asks me whether we have stock, only whether we have stock; nobody asks about price," a sales representative from a storage exhibitor told reporters with a wry smile. "But right now we can only meet about 30–40% of demand. If an order is too large, we can only turn it away."
The market is worried about demand cooling, but what attendees felt on the ground at the industry conference was still tight supply. In the packed venue, "how much longer will the storage industry face shortages?" became one of the hottest topics.
Xie Wei, general manager of the Flash Market, said: "AI is not just a trend; it is a bottom-layer revolution. It is turning storage from a cost item on the bill of materials (BOM) into a strategic resource for AI competition, and from a cyclical commodity into a core competitive advantage of the digital economy."
This is not an exaggeration.
Whether it's large-model training, inference, or fine-tuning, or multi-modal applications, every step pushes storage bandwidth and capacity to the limit. HBM has leapt from a niche high-end product to the "oil" of the AI era. High-capacity DDR5 memory has gone from an optional configuration to standard for AI servers. Enterprise SSDs are no longer mere carriers of capacity either; they are the key to breaking performance bottlenecks across the entire compute architecture.
Xie Wei explained that during inference of large models, it’s necessary to store the Key Value results for each layer and each token to avoid repeated computation and shorten response time. When context length is extended from 4K tokens to 128K tokens, the KV cache space requirements expand multiple times; and when requests arrive with high concurrency added on top, the demand scale rises quickly, and it becomes difficult for HBM alone to handle, so more and more pressure starts shifting toward NVMe SSDs.
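Xie Wei's point can be made concrete with back-of-the-envelope arithmetic. The helper below uses a hypothetical 32-layer, 32-head, 128-dimension configuration loosely modeled on a 7B-class model (illustrative assumptions, not figures from the article); because KV-cache size grows linearly with context length, going from 4K to 128K tokens inflates it 32-fold:

```python
def kv_cache_bytes(num_layers, num_heads, head_dim, context_len, bytes_per_elem=2):
    """KV cache = K and V tensors, per layer, per head, per token (fp16 = 2 bytes)."""
    return 2 * num_layers * num_heads * head_dim * context_len * bytes_per_elem

cfg = dict(num_layers=32, num_heads=32, head_dim=128)   # hypothetical 7B-class model
gib = 1024 ** 3
print(kv_cache_bytes(context_len=4 * 1024, **cfg) / gib)    # 4K context
print(kv_cache_bytes(context_len=128 * 1024, **cfg) / gib)  # 128K context
```

At tens of GiB per long-context request, multiplied by many concurrent requests, the cache quickly outgrows HBM, which is exactly the pressure the article describes shifting toward NVMe SSDs.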
“Precisely for this reason, demand for SSDs optimized for AI inference workloads is growing extremely fast, and eSSD has become the largest NAND application market in 2026.” Xie Wei judged.
"No memory, no AI." Pan Jiancheng, CEO of Phison Electronics, was even more direct. In his view, Google's compression algorithm does not mean storage demand will fall in proportion. On the contrary, compression means lower host costs and higher shipment volumes; it also means users can produce more tokens, bringing more demand for storage and access.
Morgan Stanley also believes that by sharply lowering the service cost for each query, TurboQuant enables model migration from expensive cloud clusters to local environments, effectively lowering the threshold for AI scaling deployment—this may actually further boost overall demand.
Expansion must wait; shortage is hard to solve
“Although storage original-factory manufacturers have already started increasing new capital expenditures and expanding capacity, the capacity expansion cycle for the storage industry lasts 18 to 24 months, and the earliest new production capacity won’t be released until 2027.” Xie Wei told the reporter that the issue of storage supply shortages will be difficult to ease in the short term.
In his view, no mainstream AI storage product worldwide will achieve full supply-demand balance in 2026. The focus of the storage industry has shifted from "who can offer cheaper prices" to "who can get the goods."
“At this stage, locking in production capacity is more important than talking about prices.” Xie Wei said plainly.
An executive at storage-controller leader Huiding Technology also said that 2026 is not yet the darkest hour: the supply-demand gap will widen in 2027, because this round of price increases and shortages is not a simple cyclical fluctuation but a structural transformation driven by AI. After all, the massive data generated by AI training and inference creates unprecedented demand for storage.
Cost pressure on one side, a move upmarket on the other
And that’s why a more realistic split is starting to emerge.
For traditional consumer markets such as smartphones and PCs, higher storage prices show up first as cost pressure. Some storage vendors have begun taking a "value-for-money" route, trying to deliver an equivalent or better experience with less memory.
For example, Jianghai Longcheng is working to drive full-scenario deployment of AI PCs and embedded systems by integrating HLC advanced caching technology deeply with SPU and UFS. While optimizing the AI experience, it lowers the terminal’s DRAM capacity demand and cost; Phison Electronics launched Phison Hybrid AI SSD along with aiDAPTIV+ technology, and it is expected to reduce DRAM usage by more than 50%, enabling cost control and safe local inference.
On the other side, everyone is also moving “upward” collectively—resources and capacity are being prioritized toward higher-tech, higher-value, higher-barrier products.
Previously, the AI industry's spotlight was on "training." Compute clusters' throughput is certainly impressive, but training demand is often episodic. Now the industry's focus is shifting wholesale to "inference": a "bottomless pit" with higher frequency, finer granularity, and closer ties to real commercial traffic.
According to the latest data from the National Data Administration, China's average daily token calls exceeded 140 trillion in March this year. In Jensen Huang's view, agentic AI is expected to increase token consumption 1,000-fold, creating what he calls a "compute power vacuum."
Xie Wei said directly: “We can confirm one thing: whoever can solve the power consumption and latency of data transportation in the AI era will define the next decade. Storage will enter an AI-driven supercycle.”
Zhang Shiwan, Executive Vice President of Samsung Electronics and head of the Solution Platform Development Team, said that high-performance storage is no longer an optional choice, but a core foundation that determines system decision-making efficiency and scale. Based on this judgment, Samsung is pushing forward with the PCIe Gen6 SSD PM1763 and plans to release higher-density EDSFF drives in 2026 to 2027 to increase single-device capacity and bandwidth.
Tan Hong, head of the SSD business division at Yangtze Memory, mentioned that the AI race has moved from the training stage focused on “building depth” to the inference stage focused on “early output.” Storage bandwidth bottlenecks are severely constraining compute power release; currently, GPU cluster availability is only about 50%.
In his view, the way to break the bottleneck lies in compute-storage co-optimization: on the training side, it can rely on large-capacity QLC eSSD storage for checkpoints to improve GPU efficiency; on the inference side, it can use eSSD layered caching of KV Cache to take on context state management. For such scenarios, Yangtze Memory has launched multiple new Gen5 enterprise eSSD products.
For storage vendors, the real exam is not whether they can raise prices, but whether they can stand on the layer with higher value.
From price war to value war; from single products to full-stack solutions; from compute's follower to AI's trump card. In this race where AI rewrites the rules, retail prices may occasionally loosen, but that is only ripples on the surface. Deeper down, the chips are still scarce, and still extremely expensive.
Editor: Huang Mei | Proofreader: Wang Yue | Reviewer: Chen Siyang