CITIC Securities: Remaining optimistic about the storage innovation growth trend

CITIC Securities writes in a research note that in the Agent AI era, storage capacity is the core driver propelling the storage industry into a long-cycle paradigm shift. On the supply-demand side, AI inference is driving a sharp surge in token consumption, and KV Cache grows linearly with it; the mismatch between exploding demand and memory makers' capacity expansion is making shortages the norm. The firm expects the supply-demand imbalance to persist until 2027, with price increases running through all of 2026. On the technical side, against a backdrop of extreme HBM and DRAM shortages and soaring costs, vendors presented NAND innovations to help absorb the pressure of rising GPU-memory (VRAM) capacity requirements. CITIC Securities remains optimistic about the storage innovation growth trend.

Full text follows

Storage | Looking at storage development trends from the Flash Memory Summit

In the Agent AI era, storage capacity is the core driver pushing the storage industry into a long-cycle paradigm shift. On the supply-demand side, AI inference is driving a sharp surge in token consumption, and KV Cache grows linearly with it; the mismatch between exploding demand and memory makers' capacity expansion is making shortages the norm. We expect the supply-demand imbalance to persist until 2027, with price increases running through all of 2026. On the technical side, against a backdrop of extreme HBM and DRAM shortages and persistently high costs, vendors presented NAND innovations to help absorb the pressure of rising GPU-memory (VRAM) capacity requirements. We remain optimistic about the storage innovation growth trend.

The 2026 China Flash Memory Market Summit will be held, focusing on opportunities for storage innovation and industrial-chain upgrades in the AI era.

On March 27, 2026, CFMS MemoryS 2026, the global storage industry's annual event, will be held in Shenzhen. As an industry-bellwether summit, this year's event centers on the theme "Crossing the cycle, unlocking value" and focuses on technological innovation and coordinated upgrades across the industry chain, attracting dozens of leading global companies including Samsung Electronics, Silicon Motion, Kioxia, Solidigm, Intel, and Tencent Cloud. The event spans the entire industry chain: storage-chip OEMs, controller design, module manufacturing, and cloud services. Through a dual track of high-level forums and technical exhibitions, the summit will examine the industry outlook as it evolves, with a focus on the surge in storage-capacity demand driven by the sharp rise of tokens and KV Cache in the Agent AI era, hold forward-looking discussions on frontier breakthroughs such as PCIe 5.0/6.0 SSDs and ultra-high-capacity QLC, and showcase more than 100 innovative products.

▍ AI inference drives a surge in storage demand; structural mismatches become the norm. It is expected that the supply-demand shortfall will persist at least until 2027, with price hikes running throughout all of 2026.

Demand side: According to CFM (China Flash Market) data, server shipments will grow about 15% year over year in 2026, with AI servers exceeding 20% of total server shipments. As large models move from the training stage to the inference stage, the explosion of Agent applications is driving a sharp increase in token consumption. When sequence length grows from 1k to 128k tokens, KV Cache occupancy rises from 0.5GB to 64GB per request (at BF16/FP16). Under long-context, high-concurrency workloads, storage demand scales linearly with token volume and concurrency. CFM forecasts HBM capacity to grow more than 90%/35% year over year in 2025/2026, respectively. Meanwhile, KV Cache offloading to flash, combined with insufficient HDD supply, is spilling demand over to eSSD, making it the largest downstream NAND market in 2026 (with its share rising to 37%).
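The linear relationship above can be reproduced with a back-of-envelope calculation. The model dimensions below (64 layers, 8 KV heads, head dimension 256) are illustrative assumptions chosen so the per-request figures land on the cited 0.5GB/64GB values; they are not the spec of any particular model:

```python
def kv_cache_bytes(seq_len, n_layers=64, n_kv_heads=8, head_dim=256, bytes_per_elem=2):
    """Per-request KV Cache footprint: K and V tensors (the factor of 2),
    one element per layer, KV head, and head dimension, at BF16/FP16
    (2 bytes per element). Grows linearly with sequence length."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * seq_len

# 1k-token context: 0.5 GB per request; 128k-token context: 64 GB per request.
short_ctx_gb = kv_cache_bytes(1024) / 2**30
long_ctx_gb = kv_cache_bytes(128 * 1024) / 2**30
```

Because the footprint is linear in sequence length, a 128x longer context means a 128x larger cache per request, and total demand scales again with the number of concurrent requests.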

Supply side: Capacity expansion cycles are out of sync with demand, and shortage-driven price increases will persist. Storage OEMs are generally holding prices firm, and advanced capacity is prioritized for high-gross-margin AI storage products. According to CFM, the share of relatively high-end DRAM capacity, including HBM/DDR5/LP5X/LP6, rises from under 50% in 2024 to more than 85% in 2026, while mature-process and consumer-grade capacity is continuously squeezed. Industry inventory has fallen from 10~12 weeks in Oct~Dec 2023 and 8~10 weeks in Aug~Oct 2024 to 4 weeks in 2026, below historical safety levels. With storage capacity-expansion cycles running 18~24 months, a supply inflection point cannot appear in 2H26, and Silicon Motion believes 2027 will be the "darkest moment" of the storage shortage. Starting in 2H25, storage prices have entered an epic round of increases, and CFM expects DRAM and NAND ASPs to keep rising through all of 2026. In the AI inference era, storage capacity is the core driver, and the storage industry is entering a long-cycle paradigm shift: this is super-cycle growth, not a cyclical rebound.

▍ The storage industry chain is accelerating value reconfiguration.

At the recent GTC conference, NVIDIA emphasized "token factory economics." Its core significance is to elevate storage's strategic position in AI infrastructure, which also means the storage industry's profit ceiling will be lifted over the long term. According to CFM data, eSSD ASPs in 26Q1 are already more than twice consumer-grade NAND ASPs. For storage OEMs, the key lies in media upgrades and system-architecture-level reconfiguration, and this forum's presentations focused mainly on the enterprise market. For storage solution vendors, the industry's focus is shifting from "who is cheaper" to "who can secure supply." At the same time, leading vendors such as Phison are accelerating the transition to customized, high-value-added modules powered by self-developed controllers and expanding into enterprise SSDs, redefining storage value and moving away from the traditional model of relying on low-cost inventory.

▍ AI cloud (enterprise) storage trend: the breakout of high-capacity QLC and rapid interface evolution, reshaping the compute engine.

AI is accelerating from the training stage to the inference stage; the ratio of inference to training servers is expected to reach 10:1 to 50:1. At present, constrained by storage bandwidth bottlenecks, GPU cluster utilization is only about 46% to 50%, making GPU-memory upgrades the core demand. At this summit, multiple vendors shared work on compute-storage collaboration and functional redistribution: the role of eSSD is shifting from a passive data container to a core compute engine and an expanded memory layer. On the training side, storing checkpoints on ultra-high-capacity QLC eSSD can greatly improve GPU operating efficiency. On the inference side, eSSD takes on tiered caching of KV Cache to handle massive long-context state management, vector database queries, and model shard loading. Measured results show that offloading KV Cache to SSD, by eliminating prefill recomputation, can cut time to first token (TTFT) by a factor of 41. Enterprise storage is showing the following technical trends:

Facing overflow caching demand from massive AI data and KV Cache, high-density QLC has become the key medium, and hundred-terabyte-class ultra-high-capacity QLC solutions are the first choice. Kioxia (245.76TB), DapuStor (245TB), and SanDisk (SN670 solutions up to 256TB) all showcased ultra-high-capacity QLC products above 200TB, greatly improving space efficiency and TCO.

Controller chips are moving toward software-hardware co-design to compensate for the media's weaknesses. To handle the high-frequency random read/write and bandwidth pressure that KV Cache brings in inference scenarios, controller chips are being proactively upgraded. T-Head's Zhenyue 510 natively supports the ZNS protocol and coordinates at the system level to help QLC reach large-scale commercial use, with cumulative shipments exceeding 500,000 units. Maxio Technology introduced KV acceleration engines, predictive prefetching, and other technologies, transforming the controller from a "data mover" into an active "intelligent resource dispatcher."

Rapid interface iteration and liquid-cooling innovation are adapting to ultra-large clusters of up to 100,000 GPUs. In response to the massive data throughput and high-density heat-dissipation challenges of clusters scaling from thousands to tens of thousands and even 100,000 GPUs, Samsung demonstrated the PM1763, a 16-channel PCIe 6.0 SSD that doubles input/output performance, while FADU's PCIe Gen6 controller "Lhotse" has taped out, with sequential read performance expected to reach 28.5GB/s.
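The tiered KV Cache offload described in the trends above can be illustrated with a toy two-tier cache: hot entries stay in DRAM, and cold entries spill to a file-backed store standing in for an eSSD tier. This is a minimal sketch of the idea under simplified assumptions (the class name and capacities are made up), not any vendor's implementation:

```python
import os
import tempfile
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier KV cache: an LRU-ordered DRAM tier that evicts
    cold entries to files on an SSD-backed directory. Illustrative only."""

    def __init__(self, dram_capacity=4):
        self.dram = OrderedDict()          # hot tier, LRU order
        self.dram_capacity = dram_capacity
        self.ssd_dir = tempfile.mkdtemp()  # stands in for an eSSD tier

    def put(self, key, value):
        self.dram[key] = value
        self.dram.move_to_end(key)
        while len(self.dram) > self.dram_capacity:
            # Evict the least-recently-used entry to the SSD tier.
            cold_key, cold_val = self.dram.popitem(last=False)
            with open(os.path.join(self.ssd_dir, cold_key), "wb") as f:
                f.write(cold_val)

    def get(self, key):
        if key in self.dram:               # DRAM hit
            self.dram.move_to_end(key)
            return self.dram[key]
        path = os.path.join(self.ssd_dir, key)
        if os.path.exists(path):           # SSD hit: promote back to DRAM
            with open(path, "rb") as f:
                value = f.read()
            self.put(key, value)
            return value
        return None                        # miss: would require prefill recompute
```

A hit in either tier avoids recomputing the prefill for that request, which is the mechanism behind the TTFT reduction cited above; real systems add batching, compression, and bandwidth-aware placement on top of this basic structure.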

▍ AI endpoint (consumer-grade) storage trends: on-device AI acceleration takes hold, and compute-storage fusion breaks the memory-footprint bottleneck.

The endpoint environment is extremely demanding in terms of hardware BOM costs, system power consumption, and DRAM memory usage. Therefore, shifting inference pressure from memory (DRAM) to flash (NAND) through “compute-storage fusion,” intelligent scheduling of software and hardware, and advanced caching techniques has become an important supplementary approach to overcoming the bottleneck of deploying large models on the edge.
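One concrete mechanism for shifting working-set pressure from DRAM to flash is memory-mapping model data so the OS pages it in from storage on demand instead of loading it all up front. The snippet below is a generic illustration of that mechanism using Python's standard `mmap` module (the file contents and sizes are dummy values), not a description of any vendor's product:

```python
import mmap
import os
import tempfile

# Create a stand-in "weights" file on flash (4 KiB of dummy bytes).
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(bytes(range(256)) * 16)

# Map it read-only: pages are faulted in from storage only when touched,
# so resident DRAM usage tracks the accessed working set, not file size.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    first_byte = mm[0]       # touching a byte pulls in just that page
    shard = mm[100:104]      # slice reads without loading the whole file
    mm.close()
```

Production schemes layer prefetching, caching, and quantization on top of this, but the core trade is the same: flash reads in exchange for a smaller DRAM footprint.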

AI PCs and local large models: hybrid techniques relieve the pressure of rapidly growing DRAM capacity requirements. Running large models with tens or hundreds of billions of parameters on-device is a huge test for memory. Longsys introduced a 5nm storage processing unit (SPU) with an iSA storage agent; in joint tuning and validation, it achieved local deployment of a 397B model on a PC host and cut DRAM usage by nearly 40% in 256K-context scenarios. Phison introduced its Hybrid AI SSD and aiDAPTIV+ technology, expected to reduce DRAM usage by more than 50%, keeping costs controllable while preserving secure local inference.

Smart automobiles and edge computing: moving toward centralized pooling architectures and a unified platform foundation. Embodied intelligence and advanced driver assistance place global coordination requirements on the underlying architecture. XPeng Motors stated plainly that at compute levels up to 2250 TOPS, DRAM bandwidth has become the core bottleneck for inference latency. The automotive LPDDR6 era is approaching, and in-vehicle NAND storage is moving from segmented "islands" toward centralized pooling and software-defined approaches.

Smartphones and AIoT: high-speed interfaces and advanced caching technologies are being embedded deeply. For mobile devices and emerging wearables, response speed and battery life are paramount. Silicon Motion is set to release a new-generation UFS 4.1 controller, the SM2755, and to accelerate its push into AIoT markets such as smartwatches and smart glasses. SanDisk uses SmartSLC caching to deliver high-throughput UFS 4.1 operation at only around 2W. Longsys is also bringing its HLC advanced caching technology to embedded endpoints, reducing endpoint BOM costs.

▍ Risk factors:

Risks from a sluggish global macroeconomic environment; downstream demand falling short of expectations; innovation falling short of expectations; risks from changes in the international industrial environment and intensifying trade frictions; risks that the compute capacity upgrade progress falls short of expectations; and risks that cloud vendors’ capital expenditures fall short of expectations, etc.

▍ Investment strategy:

We are optimistic about the compute-and-storage industry trend as storage capacity expands in the Agent AI era. Near-memory computing has strong industry momentum, and we are bullish on the HBM and CUBE industry chains. Meanwhile, amid the storage shortage, both mainstream and niche storage will see broad shortages and price increases; feedback from multiple vendors indicates that the quarter-over-quarter price increase in 26Q2 will remain roughly comparable to 26Q1's. We expect the industry's supply shortfall to persist at least until the end of 2027. Core recommendations: storage module companies with strong near-term earnings upside; storage OEMs and design companies closely tied to the OEMs.

(Source: Jiemian News)
