CITIC Securities: Remains optimistic about the growth trend of storage innovation
A CITIC Securities research report states that in the Agent AI era, storage capacity is the core resource, driving the storage industry into a long-cycle paradigm shift. On the supply-and-demand side, AI inference has caused token consumption to surge, with KV Cache growing linearly alongside it; the mismatch between surging demand and fab expansion timelines has made shortages the norm. Supply is expected to remain short until 2027, with price increases persisting throughout 2026. On the technology front, amid extreme shortages and the high cost of HBM and DRAM, manufacturers are sharing NAND-based innovations to relieve the pressure on memory capacity demand. CITIC Securities remains optimistic about the growth trend of storage innovation.
The full text is as follows:
Storage | Trends in Storage Development from the Flash Memory Market Summit
In the Agent AI era, storage capacity is the core resource, driving the storage industry into a long-cycle paradigm shift. On the supply-and-demand side, AI inference has caused token consumption to surge, with KV Cache growing linearly alongside it; the mismatch between surging demand and fab expansion timelines has made shortages the norm. Supply is expected to remain short until 2027, with price increases persisting throughout 2026. On the technology front, amid extreme shortages and the high cost of HBM and DRAM, manufacturers are sharing NAND-based innovations to relieve the pressure on memory capacity demand. We remain optimistic about the growth trend of storage innovation.
▍ The 2026 China Flash Memory Market Summit will be held, focusing on storage innovation and industrial chain upgrade opportunities in the AI era.
On March 27, 2026, CFMS MemoryS 2026, the global storage industry's annual event, will be held in Shenzhen. A bellwether summit for the industry, this year's event centers on the theme "Crossing Cycles, Unlocking Value," emphasizing technological innovation and coordinated upgrades across the industry chain. It draws dozens of leading global companies, including Samsung Electronics, Phison Electronics, Kioxia, Solidigm, Intel, and Tencent Cloud, spanning the full storage-chip supply chain: memory makers, controller design, module manufacturing, and cloud services. Through keynote forums and technical exhibitions, the summit will address economic trend forecasts, focus on the explosive storage-capacity demand driven by the surge in tokens and KV Cache in the Agent AI era, host cutting-edge discussions on PCIe 5.0/6.0 SSDs, breakthroughs in ultra-large-capacity QLC, and other AI-driven storage innovations, and showcase over a hundred innovative products.
▍ AI inference is driving an explosion in storage demand; structural mismatches have become the norm, supply is expected to remain short at least until 2027, and price increases are expected to persist throughout 2026.
Demand side: According to CFM (China Flash Market) data, server shipments in 2026 are expected to grow 15% year-on-year, with AI servers accounting for over 20% of total server shipments. As large models move from the training phase to the inference phase, the explosion of Agent applications drives a dramatic increase in token consumption: when sequence length grows from 1k to 128k tokens, KV Cache usage rises from 0.5GB to 64GB (BF16/FP16, per request). Under long contexts and high concurrency, storage demand scales linearly with token volume and concurrency. CFM forecasts HBM capacity to grow over 90% in 2025 and over 35% in 2026. At the same time, the tiering-down (offloading) of KV Cache toward flash, combined with overflow demand caused by HDD supply shortages, is driving eSSD to become the largest downstream NAND product in 2026 (with market share rising to 37%).
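The report's 0.5GB-to-64GB figures follow directly from the standard per-token KV Cache formula. The sketch below reproduces them under an assumed Llama-7B-class configuration (32 layers, hidden size 4096, 2-byte FP16/BF16 elements); the model parameters are illustrative assumptions, not stated in the report.

```python
def kv_cache_bytes(seq_len, n_layers=32, hidden_size=4096, bytes_per_elem=2):
    """Per-request KV Cache size: the K and V tensors each store
    `hidden_size` values per layer for every token in the sequence.
    Config values are assumed (Llama-7B-like), not from the report."""
    return 2 * n_layers * hidden_size * bytes_per_elem * seq_len

GB = 1024 ** 3
print(kv_cache_bytes(1024) / GB)        # 1k-token context   -> 0.5 (GB)
print(kv_cache_bytes(128 * 1024) / GB)  # 128k-token context -> 64.0 (GB)
```

Because the cost is a fixed ~0.5MB per token for this configuration, capacity grows strictly linearly with context length and with the number of concurrent requests, which is the mechanism behind the demand surge the report describes.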
Supply side: mismatched expansion cycles mean shortages and price increases will persist for a long time. Storage manufacturers are generally pursuing price-stabilizing strategies, prioritizing advanced capacity for high-margin AI storage products. According to CFM, the share of high-end DRAM capacity (HBM/DDR5/LP5X/LP6) is expected to rise from under 50% in 2024 to over 85% in 2026, continuously squeezing mature-process and consumer-grade capacity. Industry inventories are projected to fall from 10-12 weeks in October 2023 to 8-10 weeks in 2024 and just 4 weeks in 2026, below historical safety levels. A storage capacity expansion takes 18-24 months, so a supply turning point is unlikely to appear in H2 2026; Phison Electronics believes 2027 will be the "darkest moment" of the storage shortage. Starting in H2 2025, storage prices entered an epic rise, and CFM forecasts that DRAM and NAND ASPs will keep climbing throughout 2026. In the AI inference era, storage capacity is the core resource, pushing the industry into a long-cycle paradigm shift that supports sustained growth rather than a merely cyclical rebound.
▍ The storage industry chain accelerates value reconstruction.
At the recent GTC conference, Nvidia emphasized "Token Factory Economics," underscoring storage's strategic position in AI infrastructure and signaling that the industry's long-term profit ceiling is being raised. According to CFM data, eSSD ASPs already reached more than twice consumer-grade NAND ASPs in Q1 2026. For memory makers, the focus is on media upgrades and system-architecture-level redesign, with forum presentations concentrating on the enterprise market. For storage solution vendors, the industry's question has shifted from "who is cheaper" to "who can secure supply." Meanwhile, leaders such as Phison Electronics are accelerating the shift toward customized, high-value-added modules built on in-house controllers and expanding into enterprise SSDs, redefining storage value and moving away from the traditional low-cost inventory model.
▍ Trends in AI Cloud (Enterprise) Storage: The explosion of large-capacity QLC and rapid evolution of interfaces are reshaping computing power engines.
AI is accelerating from the "training" phase to the "inference" phase, with the ratio of inference to training servers expected to reach 10:1 to 50:1. Today, because of storage bandwidth bottlenecks, GPU cluster availability (utilization) is only around 46%-50%. Memory upgrades are the core requirement, and many manufacturers at the summit shared function-redistribution schemes for storage-compute collaboration. The role of the eSSD is evolving from a "passive data container" into a core "computing engine" and "extended memory layer": on the training side, storing checkpoints on ultra-large-capacity QLC eSSDs can significantly improve GPU operating efficiency; on the inference side, eSSDs act as a tiered cache for KV Cache, handling long-context state management, vector database queries, and model-shard loading. Test data shows that offloading KV Cache to SSD can cut time to first token (TTFT) by a factor of 41. Enterprise storage is exhibiting the following technological trends:
To meet massive AI data demand and KV Cache overflow, high-density QLC has become the key medium, and ultra-large-capacity QLC solutions in the hundreds of terabytes are becoming the preferred choice. Kioxia (245.76TB), Dapu Micro (245TB), and SanDisk (up to 256TB with the SN670) all showcased ultra-large-capacity QLC products exceeding the 200TB mark, significantly improving space efficiency and TCO.
Controller chips are moving toward hardware-software co-design to compensate for the medium's weaknesses. To handle the high-frequency random reads/writes and bandwidth pressure that KV Cache brings in inference scenarios, controller chips are being actively upgraded. The Tsinghua Unigroup Zhenyue 510 natively supports the ZNS protocol and system-level collaboration, facilitating large-scale commercial adoption of QLC, with cumulative shipments exceeding 500,000 units. Silicon Motion Technology is introducing KV acceleration engines, predictive prefetching, and other technologies, transforming the controller from a "data mover" into a proactive "intelligent resource scheduler."
Rapid interface iteration and liquid-cooling innovation are adapting storage to GPU clusters of tens of thousands of cards. Facing the massive data throughput and high-density heat of clusters with thousands, tens of thousands, and even hundreds of thousands of cards, Samsung showcased the PM1763, a 16-channel PCIe 6.0 SSD that doubles input/output performance, while FADU's PCIe Gen6 controller "Lhotse" has already taped out, with sequential read performance expected to reach 28.5GB/s.
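The "extended memory layer" role described above, where hot KV blocks stay in DRAM and cold blocks spill to eSSD and are read back on demand, can be sketched as a minimal two-tier cache. This is an illustrative toy (the class name, dict-based SSD stand-in, and LRU policy are my assumptions), not any vendor's actual implementation.

```python
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier KV Cache: hot blocks in a DRAM tier (LRU-ordered),
    cold blocks evicted to a slower SSD tier and promoted on access."""

    def __init__(self, dram_capacity):
        self.dram_capacity = dram_capacity  # max blocks held in DRAM
        self.dram = OrderedDict()           # fast tier, LRU order
        self.ssd = {}                       # stand-in for the eSSD tier

    def put(self, block_id, kv_block):
        self.dram[block_id] = kv_block
        self.dram.move_to_end(block_id)     # mark as most recently used
        while len(self.dram) > self.dram_capacity:
            # Evict the coldest block to the SSD tier instead of dropping it,
            # so long-context state survives beyond DRAM capacity.
            cold_id, cold_block = self.dram.popitem(last=False)
            self.ssd[cold_id] = cold_block

    def get(self, block_id):
        if block_id in self.dram:
            self.dram.move_to_end(block_id)
            return self.dram[block_id]
        block = self.ssd.pop(block_id)      # slow read-back from SSD
        self.put(block_id, block)           # promote back into DRAM
        return block
```

For example, with `dram_capacity=2`, inserting blocks "a", "b", "c" spills "a" to the SSD tier; a later `get("a")` reads it back and evicts the new coldest block. The real win the report describes comes from the same pattern at scale: DRAM holds only the working set, while eSSD absorbs the linearly growing long-context state.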
▍ Trends in AI edge (consumer) storage: on-device AI acceleration is taking shape, and storage-compute integration is breaking through memory bottlenecks.
On-device environments are extremely sensitive to hardware BOM cost, system power consumption, and DRAM footprint. By combining storage-compute integration, intelligent hardware-software scheduling, and advanced caching techniques, inference pressure is being shifted from memory (DRAM) to flash (NAND), an important complement for breaking through the bottlenecks of deploying large models on devices.
AI PCs and local large models: hybrid techniques ease DRAM capacity pressure. Running models with hundreds of billions or even trillions of parameters on-device poses a severe memory challenge. Jiangbolong introduced storage processing units built around a 5nm SPU and an iSA storage intelligent agent, achieving local deployment of a 397B model on a PC host in joint tuning verification while cutting DRAM usage by nearly 40% in a 256K-context scenario. Phison Electronics launched the Phison Hybrid AI SSD and aiDAPTIV+ technology, expected to reduce DRAM usage by over 50% and enable controllable, secure local inference.
Smart cars and edge computing: moving toward centralized, pooled architectures on a unified platform base. Embodied intelligence and advanced autonomous driving place system-wide coordination demands on the underlying architecture. XPeng Motors noted that at current compute levels of up to 2,250 TOPS, DRAM bandwidth has become the core bottleneck for inference latency, signaling the imminent arrival of automotive LPDDR6, while in-vehicle NAND storage shifts from isolated domains to centralized pooling and software definition.
Smartphones and AIoT: deep integration of high-speed interfaces and advanced caching. To meet the responsiveness and battery-life requirements of mobile and emerging wearable devices, Phison is about to launch its next-generation UFS 4.1 controller, the SM 2755, and is accelerating its push into the smartwatch/glasses AIoT market; SanDisk uses SmartSLC caching to run UFS 4.1 at high throughput on only about 2W of power; and Jiangbolong is bringing HLC advanced caching into embedded devices to reduce terminal BOM costs.
▍ Risk Factors:
Global macroeconomic downturn risks; downstream demand not meeting expectations; innovation not meeting expectations; changes in the international industrial environment and intensified trade frictions; computing power upgrade progress not meeting expectations; cloud vendors’ capital expenditures not meeting expectations, etc.
▍ Investment Strategy:
We are optimistic about the storage and compute industry trend driven by the rising importance of storage capacity in the Agent AI era, with the storage sector in a high-prosperity phase. We favor the HBM and CUBE industry chains. At the same time, amid the storage shortage, everything from mainstream to niche storage products is experiencing shortages and price increases; many manufacturers report that Q2 2026 price increases remain similar to the previous quarter's, and we expect industry supply to stay short at least until the end of 2027. Core recommendations: storage module companies, which have strong near-term earnings upside, along with memory makers and design companies closely tied to them.
(Source: Jiemian News)