AIMPACT News, May 16 (UTC+8): Google has disclosed architecture details of how its eighth-generation TPU (TPU 8t) connects at rack level to the Virgo network. The network uses high-radix switches in a flat, two-layer non-blocking topology and delivers four times the data center network bandwidth of the previous generation. A single fabric can connect more than 134,000 TPU 8t chips, providing 47 Pb/s of non-blocking bidirectional bandwidth and near-linear performance scaling to over 1.7K ExaFLOPS. The TPU 8t itself uses a 3D torus topology; a single super pod scales to 9,600 chips, and JAX and Pathways support expansion to more than one million chips. Key technologies include SparseCore accelerators, VPU/MXU overlap and balanced scaling, native FP4 support, and integrated Arm-based Axion CPUs that eliminate host bottlenecks. The design targets the shift in AI models from dense large language models to large-scale mixture-of-experts models and inference-intensive architectures. (Source: InfoQ)
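As a rough sanity check on the headline numbers, the sketch below divides the fabric-level figures quoted in the brief by the chip count. It assumes the totals are spread evenly across chips and treats the "over" figures as exact; the brief does not state the numeric precision behind the ExaFLOPS figure, so the per-chip result is only an order-of-magnitude estimate. All function names are illustrative.

```python
# Figures as quoted in the brief (lower bounds, taken at face value).
CHIPS_PER_FABRIC = 134_000        # chips connected by a single fabric
FABRIC_BANDWIDTH_PBPS = 47        # non-blocking bidirectional bandwidth, Pb/s
AGGREGATE_EXAFLOPS = 1_700        # 1.7K ExaFLOPS
CHIPS_PER_SUPERPOD = 9_600        # chips in a single super pod


def per_chip_bandwidth_gbps(total_pbps, chips):
    """Average fabric bandwidth per chip in Gb/s (1 Pb/s = 1e6 Gb/s)."""
    return total_pbps * 1_000_000 / chips


def per_chip_petaflops(total_exaflops, chips):
    """Average compute per chip in PetaFLOPS (1 ExaFLOPS = 1e3 PetaFLOPS)."""
    return total_exaflops * 1_000 / chips


def superpods_per_fabric(chips, chips_per_superpod):
    """How many full-size super pods the quoted chip count corresponds to."""
    return chips / chips_per_superpod


if __name__ == "__main__":
    print(round(per_chip_bandwidth_gbps(FABRIC_BANDWIDTH_PBPS, CHIPS_PER_FABRIC), 1), "Gb/s per chip")
    print(round(per_chip_petaflops(AGGREGATE_EXAFLOPS, CHIPS_PER_FABRIC), 2), "PFLOPS per chip")
    print(round(superpods_per_fabric(CHIPS_PER_FABRIC, CHIPS_PER_SUPERPOD), 2), "super pods")
```

Under these assumptions the quoted totals work out to roughly 350 Gb/s of fabric bandwidth and a little under 13 PetaFLOPS per chip, with the fabric spanning about 14 super pods.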


GateUser-8df0eb2b

· 6h ago

A single super pod with 9,600 chips that can expand to over a million; that scale would have been unimaginable last year.
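For readers unfamiliar with the 3D torus mentioned in the brief, here is a minimal sketch of how neighbors are addressed in such a topology: each chip links to six neighbors, and coordinates wrap around at the edges. The 16 x 20 x 30 shape is purely hypothetical, chosen only because it multiplies to 9,600; the brief does not give the actual super pod dimensions.

```python
def torus_neighbors(coord, dims):
    """Return the six neighbors of `coord` in a 3D torus of shape `dims`.

    Each axis wraps around, so a chip on one face links to the chip
    on the opposite face.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]
            neighbors.append(tuple(n))
    return neighbors


# Hypothetical shape: 16 * 20 * 30 == 9,600 chips (not confirmed by the source).
HYPOTHETICAL_DIMS = (16, 20, 30)

if __name__ == "__main__":
    # A corner chip still has six neighbors thanks to the wraparound links.
    print(torus_neighbors((0, 0, 0), HYPOTHETICAL_DIMS))
```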


PerpNightwatch

· 8h ago

Native FP4 support significantly reduces memory and bandwidth pressure, which should bring inference costs down.
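The memory side of this point can be illustrated with simple arithmetic. The sketch below counts only raw weight storage at different bit widths; it ignores quantization scale factors, activations, and KV cache, and the 70-billion-parameter model size is a hypothetical example, not something from the brief.

```python
def weight_bytes(num_params, bits_per_param):
    """Raw storage for model weights, ignoring scale factors and metadata."""
    return num_params * bits_per_param / 8


# Hypothetical model size, used only for illustration.
HYPOTHETICAL_PARAMS = 70_000_000_000

if __name__ == "__main__":
    for name, bits in (("16-bit", 16), ("8-bit", 8), ("FP4", 4)):
        gigabytes = weight_bytes(HYPOTHETICAL_PARAMS, bits) / 1e9
        print(f"{name}: {gigabytes:.0f} GB of weights")
```

Halving the bits per weight halves both the bytes stored and the bytes moved per weight read, which is where the reduced memory and bandwidth pressure comes from.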


GateUser-14cb5f72

· 8h ago

Near-linear scaling to 1.7K ExaFlops; this number looks like science fiction.


TheNemesisOfFomo

· 8h ago

The Pathways+JAX ecosystem is becoming more deeply integrated, with Google building its own moat.


OpcodePoet

· 8h ago

A flat topology built on high-radix switches: can other data centers copy this design approach?
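Why switch radix matters for a flat two-layer design can be sketched with the textbook leaf-spine (folded Clos) sizing rule. This is a generic model, not a description of Virgo: it assumes identical switches, equal link speeds, and leaves that split their ports evenly between endpoints and spines. The radix values are hypothetical, since the brief does not state the actual switch radix or what counts as an endpoint.

```python
def max_endpoints_two_tier(radix):
    """Maximum endpoints in a non-blocking two-tier leaf-spine fabric.

    Each leaf uses radix/2 ports for endpoints and radix/2 for uplinks.
    Each spine port connects to a distinct leaf, so there are at most
    `radix` leaves, giving radix * (radix / 2) endpoints.
    """
    if radix < 2 or radix % 2 != 0:
        raise ValueError("radix must be an even integer >= 2")
    leaves = radix
    endpoints_per_leaf = radix // 2
    return leaves * endpoints_per_leaf


if __name__ == "__main__":
    # Hypothetical radix values; capacity grows with the square of the radix.
    for radix in (64, 128, 256):
        print(radix, max_endpoints_two_tier(radix))
```

Because capacity grows with the square of the radix, doubling the radix quadruples the endpoints that two layers can reach, which is what lets a fabric stay flat instead of adding a third tier.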


ChillBlock

· 8h ago

With the shift from dense LLMs to MoE and reasoning architectures, the industry trend is changing.


Don'tCallMeABagHolder.

· 8h ago

Given the TPU 8t name, will the next generations be called 9t and 10t, directly matching NVIDIA's iteration pace?


StardustUnderTheGlassDome

· 8h ago

A fourfold increase in interconnect bandwidth eases communication bottlenecks, so parallel efficiency for large models can improve.


RedGlass

· 8h ago

How do they handle the failure rate of a million-chip cluster? Curious about their fault tolerance mechanisms.


ShortPositionsAtTheElevator

· 8h ago

The SparseCore and VPU/MXU overlap design is quite interesting; it seems to be paving the way for MoE architectures.
