GPU shortage returns: Microsoft and other cloud providers tighten supply, AI startups face a 32% increase in chip rental prices and may still have to wait until the end of the year

According to Beating Monitoring, several AI startups have reported that cloud providers such as Microsoft and Amazon are reserving GPU capacity for internal teams and major customers (OpenAI, Anthropic), while small and medium-sized customers face higher prices, long waits, and stricter contract terms. Microsoft Azure's sales management recently told employees that GPU wait times for cloud customers are expected to persist through the end of 2026.

A specific case: six months ago, image generation startup Krea (which has raised $83 million from investors including Andreessen Horowitz and Bain Capital Ventures) rented hundreds of Blackwell chips at $2.80 per card per hour under a 6-month contract. At renewal time, multiple cloud providers did not return its calls, and the deal was ultimately closed at $3.70 per hour, an increase of 32%, with the contract term extended to 1 year. CEO Victor Perez said that some providers simply do not respond, while others agree to talk only once a three-year contract is on the table. Will Falcon, CEO of GPU cloud provider Lightning AI, said that the company has 40,000 GPUs online, but total demand from about 40 queued customers is 400,000 GPUs, and rental prices have risen by more than 25% within six months.
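The figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original report; the function names `percent_increase` and `oversubscription` are hypothetical helpers introduced here for illustration, and the inputs are the prices and GPU counts quoted in the paragraph above.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


def oversubscription(demand: int, supply: int) -> float:
    """How many times larger the queued demand is than the available supply."""
    return demand / supply


# Krea's renewal: $2.80 -> $3.70 per card per hour (figures from the article).
krea_increase = percent_increase(2.80, 3.70)

# Lightning AI: 400,000 GPUs requested against 40,000 online (figures from the article).
lightning_ratio = oversubscription(400_000, 40_000)

print(f"Krea price increase: {krea_increase:.1f}%")
print(f"Lightning AI demand vs. supply: {lightning_ratio:.0f}x")
```

Under these inputs, the Krea renewal works out to roughly the 32% increase cited, and Lightning AI's queued demand is about ten times its online capacity.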

Microsoft applies tiered management to GPU access: about 1,000 of its largest customers (Tier 1) are prioritized. Small customers that want to rent Blackwell chips must commit to at least 1,000 units for at least one year, with contracts starting at tens of millions of dollars. Pay-as-you-go customers who leave GPUs idle for a few hours may have their access revoked outright. Startups in the free quota program "Microsoft for Startups" have also been told that their GPU permissions will be withdrawn if they do not use the resources sufficiently.

Hemant Taneja, a partner at General Catalyst, has sent questionnaires to the firm's portfolio companies asking about compute bottlenecks; the firm is planning either to set up a shared compute pool or to negotiate pricing collectively on their behalf. Some startups have begun considering buying GPUs themselves: Collide, an AI startup serving the oil industry, plans to spend about $50 million to purchase NVIDIA GPUs directly, rent data center space, and run its own operations to avoid queues and uncertainty.
