Computing power is in high demand! The leasing cost of NVIDIA's H100, released four years ago, has surged nearly 40% in the past six months.

Since the start of the new year, AI giants such as Anthropic and ByteDance have continued to roll out breakout applications. Combined with the “Crayfish” craze, which has boosted demand for calls to open-source large models, NVIDIA’s H100 chips are having a moment in the spotlight: a “V-shaped” reversal of their price in the leasing market.

Note that this chip was introduced by Jensen Huang at GTC in March 2022 and began shipping in the fall of the same year.

According to the “H100 one-year leasing contract price index” released on Thursday by semiconductor research firm SemiAnalysis, the leasing contract price of this “old chip” touched $1.70 per GPU per hour in October 2025 and has since surged to $2.35 per GPU per hour this March, an increase of nearly 40%.

(Source: SemiAnalysis)
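
The “nearly 40%” figure can be checked directly from the two index values quoted above. The short Python sketch below is only an illustration of that arithmetic; the function and variable names are made up for this example and do not come from SemiAnalysis.

```python
def percent_change(old: float, new: float) -> float:
    """Return the percentage change from old to new."""
    return (new - old) / old * 100.0


# Index values quoted in the article, in USD per GPU per hour.
price_oct_2025 = 1.70
price_mar_2026 = 2.35

change = percent_change(price_oct_2025, price_mar_2026)
print(f"H100 one-year contract price change: {change:.1f}%")  # prints 38.2%
```

A rise of about 38.2% is consistent with the article’s “nearly 40%”.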

The index is constructed based on direct survey data of more than 100 cloud service providers and buyers and sellers of compute resources, collected once per month.

The latest report states that on-demand GPU rental capacity has sold out across all GPU types. Even though prices have risen recently, customers who have already locked in on-demand instances are unwilling to release that capacity back to the resource pool. The firm also draws an analogy: finding GPU compute capacity in early 2026 is like trying to book a seat on the “last flight leaving.” The price is steep, and there are almost no seats left.

The researchers added: “Customers are rushing to buy Amazon Web Services’ p6-b200 spot instances at $14 per GPU per hour. Some neocloud giants have even stopped selling single nodes; some NVIDIA H100 GPUs are still being renewed at the prices originally signed 2–3 years ago, and some H100 contracts have even been extended through 2028.”

As for chips based on the more advanced Blackwell architecture, the researchers said that strong demand for open-weight models and the ongoing surge in inference demand have stretched the lead time for new Blackwell deployments to 6 to 7 months.

In late 2025, the market had expected that faster deployment of Blackwell chips, with their stronger performance and lower compute costs, would drive a significant drop in leasing prices for Hopper chips (H100, H200). The reality has turned out to be exactly the opposite: demand for the H100 not only remains resilient but in many cases is even strengthening.

SemiAnalysis noted in the report that one of the key drivers of compute demand this year is native media generation. For example, as ByteDance’s Seedance (Dream) and Google’s Nano Banana drive users to generate and refine large volumes of video and images, token throughput rises sharply. Even more notably, demand is being driven by the rise of multi-agent workloads, which is pushing token usage and compute consumption onto a parabolic growth path.

SemiAnalysis said that it alone “consumed billions of tokens in the past week,” at a cost of around $5 per million tokens. Even so, the firm says it is satisfied: thanks to the time saved and the expansion of its workflows and capabilities, the returns far exceed the compute costs.
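
To give a rough sense of scale, the sketch below converts a weekly token volume into a dollar cost at the quoted rate of about $5 per million tokens. The article only says “billions of tokens,” so the volumes used here (1 billion and 5 billion) are hypothetical placeholders, not reported figures, and the function name is invented for this example.

```python
def token_cost_usd(tokens: int, usd_per_million: float) -> float:
    """Return the cost in USD of a given number of tokens."""
    return tokens / 1_000_000 * usd_per_million


USD_PER_MILLION = 5.0  # approximate rate quoted in the article

# Hypothetical weekly volumes; the article gives no exact number.
for tokens in (1_000_000_000, 5_000_000_000):
    cost = token_cost_usd(tokens, USD_PER_MILLION)
    print(f"{tokens:,} tokens -> ${cost:,.0f}")
```

Under these assumed volumes, a week of usage would cost on the order of thousands to tens of thousands of dollars.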

The report also points out that tightening compute supply and rising prices are out of step with broader market sentiment. The share prices of emerging cloud service providers such as CoreWeave and Nebius are at the low end of their ranges over the past 6 to 12 months. The analysts say the market is still anchored to the narrative that “supply will eventually overshoot and compute will become commoditized.” In reality, with supply this tight, almost every type of compute resource will stay in strong demand, regardless of relative performance differences.

Looking ahead, the researchers offer three things to watch in judging whether GPU leasing prices will remain elevated.

First, as GB300 clusters roll out gradually throughout 2026, the market will focus on whether the new supply can actually ease the current compute tightness. Second, it is necessary to watch whether the ongoing chip shortage worsens further. Finally, it is important to observe how the major AI giants grow their annual recurring revenue (ARR), along with the pace of AI app adoption and the continued growth of token consumption.

(Source: Caixin Global)
