NVIDIA releases Blackwell cost details: GPUs cost nearly twice as much per hour, but cost per token is 35 times lower


CryptoWorld News reports that NVIDIA has released a cost breakdown for its Blackwell series, showing that the per-GPU cost is nearly twice that of the previous generation while the cost per token is 35 times lower. According to NVIDIA's blog, inference infrastructure should be assessed by "cost per token" rather than "cost per GPU per hour." The comparison uses DeepSeek-R1, a mixture-of-experts (MoE) reasoning model, as the test workload and pits Blackwell (GB300 NVL72) against the previous-generation Hopper (HGX H200). Based on cloud-market rental reference prices, Blackwell costs $2.65 per GPU per hour, nearly double Hopper's $1.41, but token output per GPU per second jumps from 90 to 6,000, an increase of about 65x. Spread across that throughput, the cost per million tokens drops from $4.20 to $0.12. Note that the $0.12 figure assumes several software optimizations are enabled, including FP4 low-precision inference and multi-token prediction.
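The arithmetic behind "cost per token" can be sketched in a few lines of Python. This is only an illustration built from the rounded figures quoted above, not NVIDIA's own calculation; the function name `cost_per_million_tokens` is made up for this example. With these rounded inputs the Hopper result comes out near $4.35 rather than the published $4.20, presumably because the published throughput and price figures are themselves rounded.

```python
def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_gpu_per_sec: float) -> float:
    """Dollars per one million output tokens for a single GPU (hypothetical helper)."""
    tokens_per_hour = tokens_per_gpu_per_sec * 3600  # seconds in one hour
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


# Rounded figures as quoted in the article (assumed inputs, not exact NVIDIA data).
hopper = cost_per_million_tokens(gpu_hourly_usd=1.41, tokens_per_gpu_per_sec=90)
blackwell = cost_per_million_tokens(gpu_hourly_usd=2.65, tokens_per_gpu_per_sec=6000)

print(f"Hopper:    ${hopper:.2f} per million tokens")
print(f"Blackwell: ${blackwell:.2f} per million tokens")
print(f"Hourly price ratio: {2.65 / 1.41:.2f}x")
print(f"Throughput ratio:   {6000 / 90:.1f}x")
print(f"Cost-per-token reduction: {hopper / blackwell:.1f}x")
```

The point of the sketch is that the hourly price rises by less than 2x while throughput rises by well over 60x, so the ratio of the two is what drives the roughly 35x drop in cost per token.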
