A bubble triggered by NVIDIA is about to burst.

Source: AGI interface

In mid-May, within the 90-day window of suspended tariffs, competition for core computing power resources suddenly heated up.

"Server prices have fluctuated sharply, rising 15%-20% per unit recently. With the tariff suspension, we plan to resume sales at the original price," a chip supplier in southern China told Huxiu.

At the same time, new variables have appeared on the supply side. According to exclusive information obtained by Huxiu, Nvidia's high-end Hopper-series products and Blackwell-series products have quietly appeared on the domestic market: the former emerged around September 2024, the latter only recently. A senior executive at Huari Intelligent Computing said, "Different suppliers have different sourcing channels." The complex supply chain network behind this is difficult to trace.

(Huxiu note: Since October 17, 2023, Washington has progressively banned Nvidia from selling chips to China, including the A100, A800, H800, H100, and H200; recently the H20, the last Hopper-series chip that could still be sold to China, was also added to the export restriction list.)

Among them, Nvidia's high-end Hopper series typically refers to the H200, an upgraded version of the H100. The H200 is priced only about 200,000 yuan higher than the H100 but delivers roughly 30% more performance. The Blackwell series sits at the top of Nvidia's lineup, with the B200 priced at over 3 million yuan, making it the most tightly restricted product in circulation, and its circulation paths are even more secretive. Both models are used for large model pre-training, and the B200 in particular is "hard to obtain."

Looking back at the timeline: in April 2024, a photo of Jensen Huang with Sam Altman and OpenAI co-founder Greg Brockman circulated on Twitter. Behind it was a key delivery milestone for the first batch of H200 units. Huang personally delivered the hardware, and OpenAI was among the first H200 customers.

Just five months later, H200 supply arrived from across the ocean. Domestic suppliers now have the capacity to deliver 100 H200 servers per week. According to the supplier, with the H100 discontinued, market demand is shifting rapidly to the H200; no more than ten suppliers currently have access to H200 stock, which further widens the supply-demand gap.

"What the market lacks most right now is the H200, and as far as I know, a major cloud vendor has recently been searching everywhere for H200s," a veteran with 18 years in the computing power industry told Huxiu. His company has long provided computing power services to Baidu, Alibaba, Tencent, and ByteDance.

In this computing power arms race, the transaction chain is shrouded in mystery. According to a leading domestic computing power supplier, the industry's prevailing pricing convention is to denominate contracts only in the computing power unit "P" (petaFLOPS), which turns a server transaction into an abstract computing power transaction. When a buyer and a supplier strike a deal, the contract states how many P of computing power will be provided rather than the specific card model; the card model never appears on paper.
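As a rough illustration of this P-denominated pricing, converting a server order into P might look like the sketch below. The per-card throughput and server configuration are assumptions for the sketch, not figures from the article:

```python
# Hedged sketch: converting a server order into the abstract "P" unit
# used in compute contracts. All specs below are illustrative assumptions.

CARDS_PER_SERVER = 8    # assumption: a typical 8-GPU server
PFLOPS_PER_CARD = 0.99  # assumption: ~989 TFLOPS dense FP16 per H200-class card

def order_to_p(num_servers: int) -> float:
    """Total contracted computing power in P (petaFLOPS)."""
    return num_servers * CARDS_PER_SERVER * PFLOPS_PER_CARD

# The 100-server weekly batch mentioned in the article:
weekly_p = order_to_p(100)
print(f"{weekly_p:.0f} P per week")
```

Under these assumptions a weekly batch of 100 servers would be sold as roughly 790 P of computing power, with no card model named in the contract.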

Deeper in the industrial chain, a hidden trading network has taken shape. Media reports have previously revealed that some Chinese distributors brought servers onto the market by roundabout routes: special procurement channels, multi-layer resale, and repackaging. Huxiu has further learned that some distributors take alternative paths, using third-party companies to acquire servers by embedding modules into other products.

Behind the turbulent undercurrents of the industry chain, the development of the domestic computing power industry is also showing new trends.

01 Where does the intelligent computing bubble come from?

At the end of 2023, the "Nvidia ban" from across the ocean hit the calm surface of the lake like a boulder, igniting a covert battle over core computing power resources.

In the initial months, the market exhibited a primitive chaos and restlessness. Under the temptation of huge profits, some keen individuals began to take risks. "At that time, the market was filled with 'suppliers' from various backgrounds, including overseas returnees and some well-informed individuals acting as middlemen," recalled an industry insider who wished to remain anonymous. "Their circulation methods were relatively simple and crude; although the transactions were still secretive, they had not yet formed the complex layers of subcontracting that emerged later."

These early "pioneers" leveraged information asymmetry and various informal channels to supply high-end Nvidia cards to the market, and prices naturally soared. According to media reports, some individual suppliers priced the Nvidia A100 at 128,000 yuan, far above its official suggested retail price of roughly 10,000 USD. More brazenly still, someone on a social media platform was seen holding up an H100 chip and claiming a unit price of 250,000 yuan. The posturing at the time bordered on open boasting.

Amid this covert circulation, some large computing power suppliers began building similar trading networks, and a surge of intelligent computing construction emerged over the same period. Between 2022 and 2024, localities rushed to build intelligent computing centers; data show that in 2024 alone there were more than 458 intelligent computing center projects.

However, this fervent "card speculation and intelligent computing craze" did not last long. By the end of 2024, especially with the emergence of domestic large models like DeepSeek, which offered high cost performance, some computing power providers that relied solely on "card hoarding" or lacked core technological support found it increasingly difficult to tell their stories. The bubble of intelligent computing also began to show signs of rupture.

According to statistics, mainland China saw 165 new intelligent computing center project developments in the first quarter of 2025. Of these, 95 projects (58%) were still at the approval or preparation stage, another 54 (33%) were under construction or about to come online, and only 16 (under 10%) had actually entered production or trial operation.

Of course, it is not just the domestic market showing signs of a bursting bubble. Over the past six months, companies such as Meta and Microsoft have reportedly paused some data center projects worldwide. The flip side of the bubble is worrying inefficiency and idle capacity.

Industry insiders told Huxiu, "Currently, the activation rate of intelligent computing centers is less than 50%. Domestic chips cannot be used for pre-training due to performance shortcomings. Moreover, some intelligent computing centers are using relatively outdated servers."

This phenomenon of "having cards but not being able to use them" is attributed by industry professionals to "structural mismatch"—it is not that there is an absolute surplus of computing power, but rather a lack of effective computing power supply that can meet high-end demand, while a large amount of established computing resources are unable to be fully utilized due to technological obsolescence, an incomplete ecosystem, or insufficient operational capabilities.

However, in the landscape of intelligent computing where noise and concerns coexist, technology giants exhibit a completely different posture.

According to reports, ByteDance plans to invest more than $12.3 billion (approximately 89.2 billion yuan) in AI infrastructure in 2025, with a budget of 40 billion yuan allocated for purchasing AI chips in China and another roughly 50 billion yuan planned for buying Nvidia chips. In response, ByteDance told Huxiu that the report is inaccurate.

Alibaba has also made significant investments in AI. CEO Wu Yongming publicly announced on February 24 that Alibaba plans to invest 380 billion yuan in building AI infrastructure over the next three years. This figure even exceeds the total of the past decade.

But in the face of large-scale purchases, the pressure on the supply side is also becoming apparent. "The market's supply can't keep up with the large manufacturers, and many companies are unable to deliver even after signing contracts," a sales representative from a smart computing supplier told Huxiu.

The contrast is stark: on one side, A-share listed companies are pulling out of large-scale computing projects; on the other, the major tech companies are pouring money into AI infrastructure.

The reason is not hard to see. The sharp contraction in intelligent computing coincided with DeepSeek's arrival. Since the start of this year, no one mentions the "Hundred Models Battle" anymore; DeepSeek punctured the bubble in training demand. Now only the big companies and a handful of AI model companies remain at the table.

In this regard, Feng Bo, managing partner of Changlei Capital, also told Huxiu, "When training is not flourishing, those who truly have the ability and qualifications to train will continue to buy cards for training, such as Alibaba and ByteDance, while those who lack the ability to train will disperse when the performance ends, and the computing power in their hands will become a bubble."

02 The computing power returned from leases

The birth of any "bubble" is rooted in irrational human imagination about scarcity. Those who speculate in Moutai or hoard computing power are not necessarily Moutai drinkers or computing power consumers; what they share is a speculative mentality.

Between the end of 2024 and the first quarter of 2025, multiple companies, including Feili Xin, Lianhua Holdings, and Jinjic Co., successively terminated computing power leasing contracts worth hundreds of millions of yuan. Meanwhile, a computing power supplier told Huxiu, "In the computing power leasing business, lease terminations are a common occurrence."

The enterprises terminating their leases are not true demand terminals for computing power. As the industry shock triggered by DeepSeek unfolds, the AI industry bubble gradually bursts, and many computing power suppliers are forced to confront the problem of excess computing power, searching for stable customers and exploring new paths for computing power absorption.

Huxiu discovered during its investigation that the business card of one computing power supplier's founder listed, alongside three companies in intelligent computing and cloud computing, a prominently printed investment company. Further digging showed that the investment company's portfolio included a robotics company and a company developing large models and cloud systems. The founder told Huxiu, "The total computing power demand of these two invested companies is met by our own supply system; moreover, the invested companies usually purchase that computing power at low market prices."

In fact, this kind of "intelligent computing plus investment" bundling is by no means unique in the industry. For many computing power suppliers, "this is currently a very effective way to absorb card capacity, it just isn't talked about openly," Feng Bo told Huxiu.

However, in the story above, this is a "monopoly" type of computing power consumption path, where computing power suppliers lock in computing power demand through investment and directly meet all the computing power needs of the invested projects. But this is not the only way.

Feng Bo believes that there is another model worth paying attention to, which is the "computing power supplier entering the industrial fund as an LP, constructing a closed-loop computing power demand chain model."

Specifically, this business model exhibits characteristics of capital linkage: computing power supplier A acts as a potential limited partner (LP) and reaches a cooperation intention with industrial fund B. In the investment landscape of fund B, AI application vendor C is the invested enterprise, and its business development has a rigid demand for computing power resources. At this time, A indirectly binds C's future computing power procurement needs by strategically investing in fund B, thereby constructing a closed loop of "capital investment - computing power procurement."

If the transaction is executed, Company A will obtain priority service rights with its LP identity and become the preferred supplier for Company C's computing power procurement. This model essentially creates a circular flow of funds - Company A's investment in Fund B ultimately flows back through Company C's computing power procurement.

"This is not a mainstream method, but it is a fairly good one to use," Feng Bo admitted.

03 The bubble is about to burst; then what?

"When discussing the intelligent computing bubble, we cannot talk only about computing power; it is an issue for the entire industrial chain. To make computing power usable, the broken points must be connected, and at present this chain has not yet formed a closed loop." A chief marketing officer of a computing power supplier with many years in the industry pointed Huxiu to the core issue of the current intelligent computing industry.

As we enter the first half of 2025, a significant trend in the AI field is that the term "pre-training," which was once frequently mentioned by major AI companies, is gradually being replaced by "inference" in terms of popularity. Whether targeting the vast C-end consumer market or empowering various B-end enterprise applications, the growth curve of inference demand appears to be exceptionally steep.

"Let's do a simple back-of-the-envelope calculation," an industry analyst estimated. "Based on the scale of mainstream AI applications on the market, such as Doubao and DeepSeek, if each active user generates an average of 10 images per day, the computing power demand behind this could easily reach the million-P level. And that is just the single scenario of image generation; add text, voice, video, and other multimodal interactions, and the demand is even harder to quantify."
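The analyst's "million P" figure can be sanity-checked with a rough sketch; every input below (user count, per-image compute cost, cluster utilization) is an illustrative assumption, not a sourced number:

```python
# Back-of-envelope check on the "million P" claim for image generation.
# All numbers are illustrative assumptions, not sourced figures.

DAILY_USERS = 100e6      # assumption: active users across mainstream apps
IMAGES_PER_USER = 10     # per the analyst's hypothetical
FLOPS_PER_IMAGE = 5e16   # assumption: ~50 PFLOP to generate one image
UTILIZATION = 0.5        # assumption: clusters run at ~50% utilization
SECONDS_PER_DAY = 86_400

def required_p() -> float:
    """Sustained computing power needed, in P (PFLOPS)."""
    total_flops_per_day = DAILY_USERS * IMAGES_PER_USER * FLOPS_PER_IMAGE
    sustained_flops = total_flops_per_day / SECONDS_PER_DAY / UTILIZATION
    return sustained_flops / 1e15  # FLOPS -> PFLOPS

print(f"{required_p():,.0f} P")
```

Under these assumptions the sustained demand works out to roughly a million P, consistent with the analyst's order-of-magnitude estimate; changing any assumption shifts the result proportionally.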

And that is only C-end inference demand; for B-end users, inference demand is larger still. A senior executive at Huari Intelligent Computing told Huxiu that car manufacturers build intelligent computing centers starting at a scale of tens of thousands of P, "and among our clients, aside from the large manufacturers, the ones with the highest computing power demand are the car makers."

Yet when the vast demand for inference is set against the computing power bubble, the story looks paradoxical. Why, with inference demand this large, is there still a computing power bubble?

A certain computing power supplier stated to Huxiu that such a massive demand for inference requires intelligent computing service providers to optimize computing power through engineering technology, such as compressing startup time, increasing storage capacity, reducing inference latency, improving throughput, and enhancing inference accuracy, among others.

Moreover, the supply-demand mismatch mentioned above also stems largely from chip problems. Industry insiders told Huxiu that a significant gap remains between some domestic cards and Nvidia's: their performance development is uneven, and even stacking more cards of the same brand does not remove the shortcomings. As a result, a single cluster cannot effectively complete AI training and inference.

This "bottleneck effect" means that even if computing power clusters are constructed by stacking chips on a large scale, if the bottleneck issue is not effectively resolved, the overall performance of the cluster will still be limited, making it difficult to efficiently support the complex training and large-scale inference tasks of AI large models.
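The bottleneck effect described above can be caricatured in a toy model: once the weakest subsystem (say, the interconnect) saturates, adding cards no longer raises a cluster's effective throughput. All figures below are illustrative assumptions:

```python
# Toy model of the "bottleneck effect": effective cluster throughput is
# capped by its weakest subsystem, so stacking more cards stops helping
# once the interconnect becomes the limit. All numbers are assumptions.

def effective_throughput(num_cards: int,
                         per_card_pflops: float = 0.3,
                         interconnect_cap_pflops: float = 50.0,
                         scaling_efficiency: float = 0.9) -> float:
    """Usable PFLOPS after interconnect and scaling losses (toy model)."""
    raw = num_cards * per_card_pflops
    # assumed scaling loss: efficiency drops per additional 512 cards
    scaled = raw * (scaling_efficiency ** (num_cards // 512))
    return min(scaled, interconnect_cap_pflops)

# Doubling cards past the bottleneck yields almost nothing:
for n in (128, 256, 512, 1024):
    print(n, round(effective_throughput(n), 1))
```

In this sketch the jump from 256 to 1024 cards buys no additional usable throughput, which is the "having cards but not being able to use them" problem in miniature.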

In fact, while the engineering challenges at the computing power level and the chip bottlenecks are real and severe, many deep-seated computing power demands remain unmet. The real "breakpoint" often lies in the application ecosystem above the computing power layer, especially the serious gap in L2-layer vertical models (models tailored to specific industries or scenarios).

The medical industry is one such enormous "hole" waiting to be filled. The talent siphon effect is a long-criticized structural problem in China's medical system: the best doctors are concentrated in tertiary hospitals in first-tier cities. Yet when the industry hopes that medical models will push high-quality medical resources down to lower tiers, a more fundamental challenge emerges: how to build a trusted medical data space?

Training a large model capable of diagnosis and treatment across the full course of a disease depends above all on data. The problem is that forming that knowledge inside a large model requires massive data covering the entire disease course, all age groups, all genders, and all regions, while in reality the open rate of medical data is less than 5%.

The director of the information department of a certain top-tier hospital revealed that among the 500TB of clinical data generated by the hospital each year, less than 3% of it is truly usable for AI training as desensitized structured data. More severely, data on rare diseases and chronic diseases, which account for 80% of the value of the disease map, has long been dormant in the "data islands" of various medical institutions due to its sensitivity.

If such breakpoints cannot be resolved, the industrial chain cannot form a closed loop. The demand for computing power naturally cannot be met, and it is clear that this has far exceeded the scope that traditional computing infrastructure providers, who merely provide "cards and electricity," can independently handle.

However, a new batch of intelligent computing service providers is quietly emerging in the market today. These companies no longer limit their positioning to merely providing hardware or renting computing power; they can also assemble professional algorithm teams and industry expert teams to deeply participate in the development and optimization of AI applications for their clients.

At the same time, in response to various issues such as resource misallocation and computing power utilization, different regions have actually introduced a variety of computing power subsidy policies based on local industrial needs. Among them, "computing power vouchers" serve as a direct subsidy method to reduce the cost of computing power for enterprises. However, for the current stage of China's intelligent computing industry, a simple policy "emergency remedy" may be difficult to fundamentally change the situation.

Today, what the intelligent computing industry needs is a "self-sustaining" cultivation ecosystem.
