AI computing costs are rising steadily, and GPU prices fluctuate "like oil" with supply and demand

AI infrastructure costs are going through a period of intense volatility, and the unpredictability of GPU server prices has become a core challenge for cloud service providers and AI developers.

According to The Information, tight supply of memory chips and other key components has pushed prices of Nvidia AI servers steadily higher over the past several months, with the cost of some components fluctuating by as much as 40% in a single week. As a result, several cloud service providers have raised the rental prices they charge AI developers. GPU cloud service provider Nebius increased its on-demand compute rental prices by approximately 30% on June 1, and Amazon AWS subsequently announced that prices for its EC2 capacity blocks would rise by roughly 20% starting July 1.

The sharp price fluctuations are reshaping the cost structure of the entire AI compute market. Carmen Li, CEO of price data provider Silicon Data, said that the GPU rental prices cloud service providers charge customers have begun to move with supply and demand, much as prices do in commodity markets such as oil. Small and medium-sized customers renting on-demand computing power are the first to feel the impact, and the lack of transparency in market pricing puts buyers at a further information disadvantage.

Component costs are fluctuating sharply, and the server pricing window is extremely narrow

The instability in GPU server prices is rooted in extreme tightness along the upstream component supply chain.

According to a person who sells Nvidia servers to cloud service providers, the cost of the components required for server racks can fluctuate by up to 40% within a week. The affected components include wafers manufactured by TSMC, co-packaging, networking, cooling, and, most notably, memory. The source said bluntly that GPU server rack prices “fluctuate very wildly,” that “everything can change completely within two or three weeks,” and that it is “impossible to predict price trends.” Prices can be locked in only within an extremely short window, which makes longer-term cost planning impossible.

An executive at a GPU cloud service provider said that the server racks they procure have recently been increasing in price by about 2% to 3% per week. Another executive at a competing firm pointed out that the NVMe storage drives in Nvidia Grace Blackwell 300 racks are the main source of price volatility. A few months ago, the fluctuations were “very intense.” Currently, rack costs are 10% to 15% higher than what they consider the “baseline price.” The upward momentum for GB300 racks now appears to be stabilizing, with monthly increases of about 1%.
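To put the weekly figures in perspective, here is a minimal sketch of how a 2% to 3% weekly increase accumulates. It assumes the increases compound week over week and uses a four-week horizon purely for illustration; neither assumption comes from the executives quoted, and the function name is hypothetical.

```python
def compound_increase(weekly_rate: float, weeks: int) -> float:
    """Cumulative fractional increase after `weeks` of compounding at `weekly_rate`.

    Assumption: each week's increase applies to the previous week's price.
    """
    return (1 + weekly_rate) ** weeks - 1


# Weekly rates quoted in the article; the 4-week horizon is an illustrative choice.
for rate in (0.02, 0.03):
    total = compound_increase(rate, 4)
    print(f"{rate:.0%} per week over 4 weeks -> {total:.1%} cumulative")
# 2% per week over 4 weeks -> 8.2% cumulative
# 3% per week over 4 weeks -> 12.6% cumulative
```

Under these assumptions, a month of increases at the quoted weekly pace adds roughly 8% to 13% to the price of a rack.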

The impact of price fluctuations is sharply magnified because the absolute amounts involved are enormous. A single rack of Grace Blackwell 300 systems holds 72 chips priced at $70,000 per chip, so a fully equipped rack comes to approximately $5 million, and some customers purchase as many as several thousand units at once. An executive at a customer currently procuring Vera Rubin racks said that the expected price for a rack of that model is approximately $7 million.
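The rack figures can be checked with simple arithmetic. The sketch below assumes the "72" refers to chips per rack at $70,000 per chip, which is the reading under which the roughly $5 million rack price works out, and applies the 2% to 3% weekly increase that one executive cited. The constant and function names are hypothetical.

```python
CHIPS_PER_RACK = 72          # assumption: "72" counts chips in one fully equipped rack
PRICE_PER_CHIP_USD = 70_000  # per-chip price quoted in the article

rack_price = CHIPS_PER_RACK * PRICE_PER_CHIP_USD


def weekly_increase_usd(price: float, weekly_rate: float) -> float:
    """Dollar increase on `price` for one week at the given fractional rate."""
    return price * weekly_rate


low = weekly_increase_usd(rack_price, 0.02)
high = weekly_increase_usd(rack_price, 0.03)

print(f"Rack price: ${rack_price:,}")
print(f"One week at 2% to 3%: ${low:,.0f} to ${high:,.0f} per rack")
# Rack price: $5,040,000
# One week at 2% to 3%: $100,800 to $151,200 per rack
```

On these numbers, a single week of delay at the quoted pace costs a buyer on the order of $100,000 to $150,000 per rack, before multiplying by the size of the order.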

Pricing power cascades down the supply chain step by step, with Nvidia and memory manufacturers holding the initiative

Behind this cost increase is a high concentration of pricing power across each link in the supply chain.

The server salesperson cited above said Nvidia “can almost demand any price.” An Nvidia spokesperson responded that pricing depends on the component costs of server racks, that the company works with server providers on pricing, and that prices may differ among providers. Data shows that Nvidia’s gross margin has increased by 15 to 20 percentage points over the past few years, which is consistent with strong pricing power.

At the same time, memory chip manufacturers such as Micron are exerting similar pricing pressure on Nvidia and other customers, driving up prices across the board, from Apple's Macs to Nvidia GPUs.

Carmen Li noted that once chips leave Nvidia, the prices that cloud service providers charge to rent them out start to follow the supply-and-demand logic of commodity markets. Her data shows that the rental price for Blackwell 200 chips has risen by about 20% since the beginning of this year. Rental prices for older Nvidia chips rose by more than 20% cumulatively over the past year but have basically leveled off over the past 30 days.

Small and medium-sized customers face the heaviest pressure, and structural gaps exist in market pricing transparency

In this round of price increases, customers renting on-demand computing power are in the most vulnerable position.

With GPU supply tight, cloud service providers are testing how high prices can go, and they may also steer server resources toward large customers, leaving less computing power available to small and medium-sized customers. However, the price trend is not one-directional. An executive at an AI model developer said that prices doubled over the preceding one to two months and then declined over the past two weeks. This divergence reflects a GPU cloud service market that is still at a relatively early stage: the number of GPU cloud service providers has surged, and the market landscape has not yet settled.

The lack of pricing transparency further exacerbates buyers’ uncertainty. GPU cloud service providers typically do not publicly disclose actual prices, which effectively means that pricing power lies with the service providers rather than the customers.

An investor in a GPU cloud service provider expressed concern: “For our core customers, there is a tipping point—once they can’t make the numbers work economically, their business becomes hard to sustain, and we absolutely do not want to cross that red line.” This statement reveals that the continued rise in computing costs will ultimately impose substantive constraints on the commercial viability of the AI application layer.

Risk warning and disclaimer

Markets carry risk, and investments should be made with caution. This article does not constitute personal investment advice, nor does it take into account the specific investment objectives, financial situation, or needs of any individual user. Users should consider whether any opinions, views, or conclusions in this article are compatible with their specific circumstances. Any investment made based on this article is at the user’s own risk.