Computing power is in high demand! The leasing cost of NVIDIA's H100, released four years ago, has surged nearly 40% in the past six months.

Since the start of the year, AI heavyweights such as Anthropic and ByteDance have kept rolling out breakout applications, and the “hyped-up lobster” wave has driven a surge in calls to open-source large language models. Against that backdrop, Nvidia’s H100 chips have staged a “V-shaped” reversal in value on the leasing market.

The chip was unveiled by Nvidia CEO Jensen Huang at GTC in March 2022 and began shipping in the fall of that year.

According to the “H100 one-year leasing contract price index” released on Thursday by semiconductor research firm SemiAnalysis, the leasing contract price of this “old chip” touched $1.70 per GPU per hour in October 2025 and has since surged to $2.35 per GPU per hour this March, an increase of nearly 40%.
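As a quick check on the "nearly 40%" figure, the change can be recomputed from the two index readings quoted above. This is a minimal sketch; the function name `pct_change` is illustrative and not part of the SemiAnalysis report.

```python
def pct_change(old: float, new: float) -> float:
    """Return the percentage change from old to new."""
    return (new - old) / old * 100

# Index readings quoted in the article (USD per GPU per hour).
oct_2025 = 1.70
mar_2026 = 2.35

change = pct_change(oct_2025, mar_2026)
print(f"{change:.1f}%")  # prints 38.2%
```

The result, about 38%, is consistent with the article's rounding to "nearly 40%".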

(Source: SemiAnalysis)

The index is built from direct survey data collected monthly from more than 100 cloud service providers, as well as buyers and sellers of compute resources.

The latest report says that on-demand GPU compute is sold out across all GPU types. Even after the recent price increases, customers who have already secured on-demand instances are unwilling to release that compute back into the resource pool. The firm likens looking for GPU compute in early 2026 to trying to book a seat on “the last flight leaving the airport”: prices are steep, and almost nothing is available.

The researchers added: “Customers are rushing to buy Amazon Web Services’ p6-b200 spot instances at a price of $14 per GPU per hour. Some emerging cloud service giants (Neocloud Giants) have even stopped offering single-node sales; some Nvidia H100 GPUs are still being renewed at their original prices from 2–3 years ago, and some H100 contracts are even being extended directly to 2028.”

As for the more advanced Blackwell architecture, the researchers said that strong demand for open-weight models and the continuing surge in inference demand have stretched the lead time for new Blackwell deployments to six to seven months.

In late 2025, the market had expected that the accelerated deployment of Blackwell chips, which offer stronger performance and lower compute costs, would push leasing prices for Hopper chips (H100, H200) down significantly. The reality has been exactly the opposite: demand for the H100 is not only holding up but in many cases is even increasing.

In its report, SemiAnalysis noted that one of the key drivers of compute demand at the start of this year is native media generation. For example, with ByteDance’s Seedance (Jimeng) and Google’s Nano Banana, token throughput rises substantially as users generate and refine large volumes of videos and images. An even more notable source of demand is the rise of multi-agent workloads, which is driving token usage and compute consumption to grow parabolically.

SemiAnalysis said that it alone “consumed billions of tokens in the past week,” at a cost of roughly $5 per million tokens. The firm nonetheless said it was satisfied that the returns from the time saved and from expanded workflows and capabilities far outweigh the compute costs.
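To put the quoted rate in perspective, here is a minimal sketch of the cost arithmetic. The report gives only "billions of tokens" rather than an exact count, so one billion tokens is used purely as an illustrative unit; the function and constant names are hypothetical.

```python
# Rate quoted in the article: roughly $5 per one million tokens.
PRICE_PER_MILLION_USD = 5.0

def token_cost_usd(tokens: int, price_per_million: float = PRICE_PER_MILLION_USD) -> float:
    """Return the cost in USD of consuming the given number of tokens."""
    return tokens / 1_000_000 * price_per_million

# Illustrative unit only: the actual weekly token count is not disclosed.
cost_per_billion = token_cost_usd(1_000_000_000)
print(f"${cost_per_billion:,.0f} per billion tokens")  # prints $5,000 per billion tokens
```

At that rate, every billion tokens consumed costs about $5,000, so "billions of tokens" in a week implies a weekly bill on the order of thousands to tens of thousands of dollars.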

The report also points out that the dynamic of tightening compute supply and rising prices is disconnected from broader market sentiment. The stock prices of emerging cloud service providers such as CoreWeave and Nebius are at the low end of their ranges for the past 6 to 12 months. Analysts say the market is still anchored to the narrative that “supply will eventually overshoot and compute will become commoditized.” In reality, in a severely supply-constrained environment, nearly all types of compute resources will remain in strong demand, regardless of their relative performance differences.

Looking ahead, the researchers identified three things to watch in judging whether GPU leasing prices will remain elevated.

First, as GB300 clusters gradually ramp up over the course of 2026, the market will focus on whether the new supply can actually ease the current compute tightness. Second, it will be necessary to watch whether the ongoing chip shortage worsens further. Finally, it will be important to track the growth in annual recurring revenue (ARR) at the major AI companies, the pace at which AI applications are being adopted, and the continued growth in token consumption.

(Source: Caixin Global / Jiemian)
