NVIDIA releases Blackwell cost details: GPUs are twice as expensive, and each token is 35 times cheaper in return


According to Beating Monitoring, NVIDIA has published a blog post analyzing hardware selection for inference. Its core argument: inference infrastructure should be evaluated by "cost per token," not "cost per GPU per hour." Measured by GPU unit price, Blackwell is more expensive; measured by token cost, Blackwell far surpasses the previous generation.

The blog uses DeepSeek-R1 (a mixture-of-experts reasoning model) as the test subject, comparing Blackwell (GB300 NVL72) against the previous-generation Hopper (HGX H200). At cloud-market rental reference prices, Blackwell costs $2.65 per GPU per hour, nearly double Hopper's $1.41, but token output per GPU jumps from 90 to 6,000 per second. That roughly 65-fold throughput gain more than offsets the higher rental price, cutting the cost per million tokens from $4.20 to $0.12. Token output per megawatt rises 50-fold.
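The cost-per-token arithmetic behind these figures can be sketched as follows (a minimal illustration using the article's numbers; the function name and rounding are mine, not NVIDIA's, and the article's $4.20 figure is slightly lower than the raw division yields, presumably due to rounding in its inputs):

```python
def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_gpu_second: float) -> float:
    """Dollars per one million output tokens, given rental price and throughput."""
    tokens_per_hour = tokens_per_gpu_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000

# Figures quoted in the article:
hopper = cost_per_million_tokens(1.41, 90)       # HGX H200
blackwell = cost_per_million_tokens(2.65, 6000)  # GB300 NVL72

print(f"Hopper HGX H200:  ${hopper:.2f} per million tokens")
print(f"Blackwell GB300:  ${blackwell:.2f} per million tokens")
print(f"Per-token cost ratio: ~{hopper / blackwell:.0f}x")
```

Dividing the two results reproduces the roughly 35-fold per-token cost advantage in the headline.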

Caveats: the $0.12 figure assumes all software optimizations are enabled, including FP4 low-precision inference and MTP (multi-token prediction, which lets the model emit several tokens per step to speed up generation). SemiAnalysis InferenceX v2 raw data shows that running DeepSeek-R1 on GB300 NVL72 without MTP costs about $2.35 per million tokens; with MTP enabled, it drops to about $0.11, a roughly 21-fold difference from this one optimization alone. All figures come from tests of the single DeepSeek-R1 model; other model architectures and scales will yield different numbers.
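The quoted MTP impact checks out arithmetically (a quick sanity calculation on the SemiAnalysis figures cited above; variable names are mine):

```python
# Per-million-token costs for DeepSeek-R1 on GB300 NVL72, as quoted in the article
without_mtp = 2.35  # USD, MTP disabled
with_mtp = 0.11     # USD, MTP enabled

mtp_factor = without_mtp / with_mtp
print(f"MTP alone cuts per-token cost ~{mtp_factor:.0f}x")
```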
