Why Can't AI Data Centers Rely on GPUs Alone? How Memory, Networking, and Storage Work Together

Introduction

In June 2026, Bitcoin is hovering around $60,000, Ethereum is fluctuating around $1,600, and the crypto market is in a bottoming phase. Another sector, AI data center infrastructure, is running far hotter. Gartner predicts that global IT spending will reach $6.31 trillion in 2026, up 13.5% year over year, with data center systems leading all categories at 55.8% growth. IDC expects global enterprise spending on AI to reach $940 billion in 2026.

In this compute arms race, a key shift in thinking is under way: the competitiveness of an AI data center no longer depends solely on GPU count and peak compute, but increasingly on how well compute, storage, and networking work together within the cluster. Understanding how memory, networking, and storage interact has become a basic skill for evaluating AI infrastructure investments.

Memory Wall: The First Bottleneck of the Large Model Era

The parameter scale of large AI models has grown exponentially over the past two years. From 2024 to 2026, the parameter count of mainstream large models grew roughly a hundredfold, and context windows expanded from tens of thousands of tokens to millions. Server memory bandwidth, however, has grown by less than 15% per year, far slower than AI workloads. This mismatch between hardware and software iteration speeds has made the "memory wall" a core bottleneck on usable AI compute.

The memory wall refers to compute throughput on CPUs and GPUs improving far faster than memory bandwidth and latency. Compute chips run extremely fast, but data access cannot keep up, so processors spend much of their time idle, waiting for data. According to industry test reports, in a 10,000-GPU cluster, data I/O bottlenecks can push GPU idle time above 40%, meaning that expensive compute chips spend nearly half their time waiting for data movement.
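The gap between compute and memory can be illustrated with the roofline model, a common back-of-the-envelope tool. The sketch below is a minimal illustration, not a measurement: the peak compute, memory bandwidth, and arithmetic intensity figures are hypothetical round numbers, and the function names are our own.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Roofline model: throughput is capped by compute or by memory traffic.

    peak_flops:    peak compute rate in FLOP/s
    mem_bandwidth: memory bandwidth in bytes/s
    intensity:     arithmetic intensity of the workload, in FLOP per byte moved
    """
    return min(peak_flops, mem_bandwidth * intensity)


def compute_utilization(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Fraction of peak compute the workload can actually use."""
    return attainable_flops(peak_flops, mem_bandwidth, intensity) / peak_flops


# Hypothetical accelerator (assumed values, not a real product spec).
PEAK_FLOPS = 1e15      # 1,000 TFLOP/s
MEM_BANDWIDTH = 4e12   # 4 TB/s

# Below this intensity the chip is memory-bound; above it, compute-bound.
ridge_point = PEAK_FLOPS / MEM_BANDWIDTH

if __name__ == "__main__":
    print(f"ridge point: {ridge_point:.0f} FLOP/byte")
    for intensity in (50.0, 100.0, 500.0):
        u = compute_utilization(PEAK_FLOPS, MEM_BANDWIDTH, intensity)
        print(f"intensity {intensity:>5.0f} FLOP/byte -> utilization {u:.0%}")
```

Under these assumed figures, any workload that performs fewer than 250 floating-point operations per byte moved leaves compute units idle, however fast the chip is on paper.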

Memory is also scarce. A single AI inference server consumes more than ten times the DRAM and HBM of a traditional data center server, and nearly 60% of global DRAM wafer capacity is now taken up by AI clusters. HBM has long been sold out, with most production capacity pre-committed to large customers through 2026 or even 2027. Gartner points out that strong demand combined with supply bottlenecks has pushed HBM prices to historic highs, and that the rapid price increase has made memory a highly profitable segment for semiconductor manufacturers.

The industry is attacking the memory wall along two paths. The first is fine-grained scheduling and compression in software, which makes better use of existing memory through techniques such as hierarchical KV cache scheduling and low-bit quantization. The second is architectural change in hardware, including successive HBM generations and the rollout of new memory interconnect protocols such as CXL (Compute Express Link). NVIDIA's new HGX Rubin platform, for example, has tripled GPU memory bandwidth to 176 TB/s. The two paths are complementary rather than substitutes: the whole supply chain is working together to reshape how memory and compute cooperate.
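To see why low-bit quantization matters for the KV cache, a rough size estimate helps. The sketch below uses the usual accounting (one key and one value vector per layer, per KV head, per token). The model dimensions are hypothetical and do not describe any specific model, and the estimate ignores the small overhead of quantization scales and zero points.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bits_per_value: int) -> int:
    """Approximate KV cache size in bytes.

    The factor of 2 accounts for storing both keys and values.
    Quantization metadata (scales, zero points) is ignored.
    """
    values = 2 * layers * kv_heads * head_dim * seq_len * batch
    return values * bits_per_value // 8


# Hypothetical model configuration (assumed for illustration only).
CONFIG = dict(layers=80, kv_heads=8, head_dim=128, seq_len=1_000_000, batch=1)

if __name__ == "__main__":
    for bits in (16, 8, 4):
        size_gb = kv_cache_bytes(**CONFIG, bits_per_value=bits) / 1e9
        print(f"{bits:>2}-bit KV cache for one 1M-token request: {size_gb:.2f} GB")
```

With these assumed dimensions, a single million-token request needs hundreds of gigabytes of cache at 16-bit precision, more than the HBM on any single accelerator, which is why compression and tiering are pursued together.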

Networking: The "Neural Network" of AI Clusters

If memory addresses the data movement efficiency within a single node, networking solves the problem of data flow between nodes. In large-scale AI clusters, hundreds or thousands of GPUs need to work together to complete a model training or inference task, and the communication efficiency between GPUs directly affects the overall training speed.
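A simple cost model shows how link bandwidth feeds into training speed. The sketch below estimates the time of a ring all-reduce, a widely used way to sum gradients across GPUs, using the standard count of 2(N-1) steps that each move 1/N of the message. The message size, link bandwidth, and per-step latency are hypothetical, and real collectives overlap communication with compute, so treat the result as an upper-level estimate only.

```python
def ring_allreduce_seconds(num_gpus: int, message_bytes: float,
                           link_bytes_per_s: float,
                           step_latency_s: float = 0.0) -> float:
    """Estimated time for one ring all-reduce.

    A ring all-reduce runs 2 * (N - 1) steps (reduce-scatter, then all-gather).
    In each step every GPU sends one chunk of size message_bytes / N.
    """
    if num_gpus < 2:
        return 0.0  # nothing to exchange with a single GPU
    steps = 2 * (num_gpus - 1)
    chunk_bytes = message_bytes / num_gpus
    return steps * (step_latency_s + chunk_bytes / link_bytes_per_s)


if __name__ == "__main__":
    GRADIENT_BYTES = 1e10   # 10 GB of gradients (assumed)
    LINK_BANDWIDTH = 1e11   # 100 GB/s per link (assumed)
    for n in (8, 64, 1024):
        no_latency = ring_allreduce_seconds(n, GRADIENT_BYTES, LINK_BANDWIDTH)
        with_latency = ring_allreduce_seconds(n, GRADIENT_BYTES, LINK_BANDWIDTH,
                                              step_latency_s=5e-6)
        print(f"{n:>5} GPUs: {no_latency:.4f} s bandwidth-only, "
              f"{with_latency:.4f} s with 5 us per step")
```

The bandwidth term approaches twice the message size divided by link bandwidth as the ring grows, while the latency term grows linearly with GPU count, which is one reason large clusters care about both link speed and topology.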

Bandwidth bottlenecks now exist at several layers. Between chips, traditional PCB board-level interconnects can no longer meet the bandwidth and latency requirements of AI chips. Within racks, server-to-server interconnect bandwidth constrains scale-up. Between data centers, the bandwidth and latency of long-distance links limit scale-out and the efficiency of cross-region compute scheduling. By some estimates, data movement in current AI training clusters already consumes more energy than computation itself.

NVIDIA's NVLink and InfiniBand have long dominated interconnects inside AI clusters. NVIDIA's latest NVLink Switch bandwidth has reached 28.8 TB/s, twice that of the previous generation. That dominance is now being challenged: AMD, Broadcom, and other vendors are promoting their own interconnect solutions, and open standards such as UALink (Ultra Accelerator Link) are gaining momentum. By 2026, networking has shifted from "NVIDIA exclusivity" to "multi-standard competition," which places higher demands on data center operators' system integration capabilities.

Storage: From "Warehouse" to "Data Pipeline"

In traditional data centers, storage plays the role of a "data warehouse"—mainly used for storing and archiving cold data. But in AI data centers, the role of storage has been upgraded to a "data pipeline"—needing to continuously deliver training data to compute nodes at extremely high speeds and support low-latency model parameter reads in inference scenarios.

AI training requires high-speed reading of massive amounts of raw data, while inference requires fast access to model weights and KV caches. KV caches have begun to extend from GPU HBM down to system DRAM, and even further to local high-speed SSDs. This means the boundary between storage and memory is blurring; storage devices are no longer just endpoints for data but key nodes in the data flow pipeline.
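The tiering idea can be sketched as a greedy placement: the most recently used KV cache blocks go to the fastest tier that still has room, and colder blocks spill to slower tiers. This is a simplified illustration under assumed capacities and block sizes. The class and function names are hypothetical, and it is not how any particular inference system implements its scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KVBlock:
    block_id: str
    size_gb: int
    last_access_step: int  # larger means more recently used


def place_blocks(blocks: list[KVBlock], tiers: list[tuple[str, int]]) -> dict[str, str]:
    """Assign each block to a tier, hottest blocks first.

    tiers is ordered fastest to slowest as (name, capacity_gb).
    Blocks that fit nowhere are marked "evicted".
    """
    remaining = [capacity for _, capacity in tiers]
    placement: dict[str, str] = {}
    for block in sorted(blocks, key=lambda b: b.last_access_step, reverse=True):
        for i, (tier_name, _) in enumerate(tiers):
            if block.size_gb <= remaining[i]:
                remaining[i] -= block.size_gb
                placement[block.block_id] = tier_name
                break
        else:
            placement[block.block_id] = "evicted"
    return placement


# Assumed tier capacities and block sizes, for illustration only.
TIERS = [("HBM", 4), ("DRAM", 4), ("SSD", 100)]
BLOCKS = [
    KVBlock("a", 2, last_access_step=10),
    KVBlock("b", 2, last_access_step=9),
    KVBlock("c", 2, last_access_step=5),
    KVBlock("d", 2, last_access_step=4),
    KVBlock("e", 2, last_access_step=1),
]

if __name__ == "__main__":
    for block_id, tier in place_blocks(BLOCKS, TIERS).items():
        print(f"block {block_id} -> {tier}")
```

In this toy setup the two hottest blocks stay in HBM, the next two land in DRAM, and the coldest one goes to SSD. A production scheduler would also weigh the cost of moving blocks between tiers, which this sketch leaves out.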

All-flash storage is replacing mechanical hard drives as the mainstream choice for AI data centers. The all-flash storage and native high-speed interconnect products that Sugon showcased at ISC High Performance 2026 illustrate this trend. Storage performance directly determines whether data reaches the compute units in time, and therefore GPU utilization.

"Compute-Storage-Network" Synergy: From Breakthroughs to System Optimization

After understanding the respective roles and bottlenecks of the three, the meaning of "synergy" becomes clear: the true computing power of an AI data center is not a simple sum of GPU computing power, memory bandwidth, network throughput, and storage IOPS, but the effective output of these four elements coupled at the system level.
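One way to make "coupled, not summed" concrete is a bottleneck model: if each subsystem can sustain a certain rate of work, the system runs at the rate of the slowest one. The sketch below is a deliberate simplification with hypothetical rates expressed in a common unit (tokens per second). Real clusters overlap stages and have more complex interactions.

```python
def effective_throughput(stage_rates: dict[str, float]) -> tuple[str, float]:
    """Return the limiting stage and the system rate under a bottleneck model.

    stage_rates maps a subsystem name to the rate it could sustain on its own.
    """
    bottleneck = min(stage_rates, key=stage_rates.get)
    return bottleneck, stage_rates[bottleneck]


# Assumed sustainable rates in tokens/s, for illustration only.
BASELINE = {"compute": 1000.0, "memory": 600.0, "network": 800.0, "storage": 700.0}

if __name__ == "__main__":
    print("baseline:", effective_throughput(BASELINE))

    # Doubling compute alone changes nothing while memory is the limit.
    more_gpus = {**BASELINE, "compute": 2000.0}
    print("2x compute:", effective_throughput(more_gpus))

    # Relieving the memory limit moves the bottleneck to the next slowest stage.
    more_memory_bw = {**BASELINE, "memory": 1200.0}
    print("2x memory bandwidth:", effective_throughput(more_memory_bw))
```

In this toy model, adding GPUs leaves output unchanged, while doubling memory bandwidth lifts it only until storage becomes the limit, which is the pattern the synergy argument describes.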

The continued growth of large model parameters has given rise to very large AI clusters. How much useful compute such a cluster delivers no longer depends solely on chip performance, but increasingly on how efficiently compute, storage, and networking are coordinated across the cluster. This view is becoming an industry consensus.

From an industry practice perspective, the "tight coupling" design of compute, storage, and network has become a standard approach among leading vendors. Sugon's scaleX AI Super Cluster adheres to the tightly coupled design philosophy of compute, storage, and network, significantly improving training and inference efficiency. NVIDIA's Dynamo 1.0 inference operating system, paired with the BlueField-4 CMX platform, connects multiple layers including GPU, HBM, host DRAM, local flash, and remote storage, breaking the single-card memory island through automatic hot and cold data separation.

In its June 2026 report, IDC stated that the competitive advantage in AI has shifted: the key is no longer having the most compute, but turning AI into a sustainable business capability at the lowest token cost. Token cost, in turn, is driven by the combined efficiency of compute, memory, networking, and storage.
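The link between utilization and token cost is simple arithmetic. The sketch below is a rough illustration with hypothetical figures for hourly cluster cost and peak throughput. It is not IDC's methodology, and it folds all memory, network, and storage effects into a single utilization factor.

```python
def cost_per_million_tokens(cluster_cost_per_hour: float,
                            peak_tokens_per_s: float,
                            utilization: float) -> float:
    """Cost in dollars to generate one million tokens.

    utilization is the fraction of peak throughput actually achieved
    once memory, network, and storage stalls are accounted for.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = peak_tokens_per_s * utilization * 3600
    return cluster_cost_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    HOURLY_COST = 100.0    # dollars per hour (assumed)
    PEAK_RATE = 50_000.0   # tokens per second at full utilization (assumed)
    for u in (1.0, 0.8, 0.6):
        cost = cost_per_million_tokens(HOURLY_COST, PEAK_RATE, u)
        print(f"utilization {u:.0%}: ${cost:.3f} per million tokens")
```

Because cost scales with the inverse of utilization, a cluster that idles 40% of the time pays about two thirds more per token than the same hardware kept fully busy.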

Market Landscape: Who Is Benefiting?

This industry trend has been fully reflected in the capital markets.

On the memory side, SK Hynix is the standout stock of 2026. On June 22, 2026, its share price rose 6% to an all-time high of 2,944,000 Korean won, overtaking Samsung as the largest company by market capitalization on the Korean stock market, with year-to-date gains exceeding 349%. Micron also performed strongly: it reported quarterly earnings in the last week of June with revenue more than quadrupling year over year, announced 16 long-term supply agreements, and saw its stock rise 16% on the day of the release.

On the networking side, the stock of fiber optic supplier Corning hit an all-time high in the last week of June, as the market reprices the critical role of fiber optics in AI data centers. Cisco's AI infrastructure orders have exceeded $9 billion.

On the server and system integration side, Dell's AI-optimized server quarterly revenue reached $16.1 billion, a 757% year-over-year increase. Supermicro holds approximately 70% market share in direct liquid cooling technology.

On the data center operations side, BOCOM International has listed GDS-SW and SUNEVISION as top buy targets in the data center sector, believing that generative AI has ignited explosive demand growth. UBS also noted that China's internet data center industry will significantly accelerate from the second half of 2026.

How to Participate in AI Infrastructure Investment Through the Gate Platform?

Gate has listed over 12,500 stocks and ETFs in markets including US stocks, Hong Kong stocks, and Korean stocks. Investors can use a unified account to directly participate in global stock trading using digital assets like USDT, achieving unified allocation of crypto assets and traditional securities.

In the AI data center infrastructure sector, Gate covers the entire industrial chain from chips to applications:

For US stocks, investors can trade core companies such as NVIDIA (NVDA), AMD, Micron (MU), Broadcom (AVGO), Dell (DELL), Super Micro Computer (SMCI), Corning (GLW), and Cisco (CSCO). Gate supports pre-market and after-hours trading, extending trading hours to 16×5, allowing users to respond more promptly to corporate earnings and macroeconomic data.

For Hong Kong stocks, attention can be paid to data center operators such as GDS-SW (09698.HK) and SUNEVISION (01686.HK).

For Korean stocks, SK Hynix (000660.KS) is the absolute leader in the HBM field; Jeju Semiconductor plays a key upstream role in AI data center optical communication materials.

Gate stock trading offers commission rates as low as 0.1% and supports both spot and leveraged trading. Users holding a position of $2,000 qualify for exclusive VIP rates. For investors looking to build systematic exposure to AI data center infrastructure, Gate's cross-market, multi-asset, one-stop trading lowers the barrier to global technology asset allocation.

Conclusion

AI data centers are moving from the extensive era of "piling up GPUs" to the refined era of "system optimization." Memory, networking, and storage are no longer isolated infrastructure components but system variables that jointly determine the true output of AI computing power under the "compute-storage-network" synergy framework.

Understanding this logic helps in assessing technology trends and also provides a more solid analytical framework for investment decisions: from chips to memory, from networking to storage, from servers to data center operations, the revaluation of the entire supply chain is just beginning. As short-term volatility in the crypto market intersects with the long-term narrative of AI infrastructure, an allocation window spanning digital assets and physical industries is opening.

FAQ

Q1: Why can't AI data centers solve computing power problems by simply piling up GPUs?

GPUs are only the endpoint where compute is delivered. Their performance depends heavily on whether memory bandwidth can supply data in time, whether the network can coordinate multi-GPU parallelism effectively, and whether storage can keep up with massive data reads and writes. In a 10,000-GPU cluster, data I/O bottlenecks can push GPU idle time above 40%, so piling up GPUs without addressing how these three work together wastes a large share of the compute.

Q2: Why is HBM so scarce?

HBM (High Bandwidth Memory) is the standard memory for AI chips. Its manufacturing process is complex, and capacity expansion cycles take over two years. In 2026, AI inference demand has surpassed training scenarios, further driving demand for HBM and large-capacity DRAM. Major production capacity has been pre-committed by large customers through 2026 or even 2027, making short-term supply elasticity extremely limited.

Q3: What is the core logic of AI data center infrastructure investment?

The core logic is shifting from "training-driven" to "full-stack demand explosion." The four tech giants (Microsoft, Google, Amazon, and Meta) are expected to spend a combined $725 billion on AI infrastructure capital expenditure in 2026. Spending on this scale cannot be absorbed by a single link in the chain; the entire supply chain, from chips, memory, and networking to data center operations, is in a structural upcycle.

Q4: How can AI data center stocks be traded on the Gate platform?

Gate has listed over 12,500 stocks and ETFs in US, Hong Kong, and Korean markets. Users can deposit funds using digital assets like USDT and trade core AI infrastructure stocks such as NVIDIA, Micron, and SK Hynix in a unified account. It supports pre-market and after-hours trading as well as both spot and leveraged trading, with fees as low as 0.1%.

Q5: What are the investment risks of AI data center infrastructure?

Main risks include: first, supply-demand mismatches may lead to periodic oversupply—BOCOM International points out the need to watch for possible periodic supply-demand mismatches and valuation fluctuations over a longer cycle; second, the sustainability of capital expenditure by hyperscale cloud service providers—Morgan Stanley notes that capital expenditure growth from 2025 to 2026 far exceeds actual revenue growth, putting pressure on cash flow; third, geopolitical factors and export controls disrupting the advanced process chip supply chain.
