Google discloses rack-scale network architecture details for its eighth-generation TPU 8t

AIMPACT News, May 16 (UTC+8): Google has disclosed how its eighth-generation TPU (TPU 8t) connects at rack scale to the Virgo network. Virgo uses high-radix switches in a flat, two-layer, non-blocking topology, giving four times the data center network bandwidth of the previous generation. A single fabric can connect over 134k TPU 8t chips, provides 47 Pb/s of non-blocking bidirectional bandwidth, and delivers near-linear scaling to over 1.7K ExaFlops. The TPU 8t itself uses a 3D torus topology; a single superpod scales to 9,600 chips, and JAX and Pathways support expansion to over one million chips. Key technologies include SparseCore accelerators, VPU/MXU overlap with balanced scaling, native FP4 support, and integrated Arm-based Axion CPUs that eliminate host bottlenecks. The design targets the shift in AI models from dense large language models to large-scale mixture-of-experts models and inference-intensive architectures. (Source: InfoQ)
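To put the headline numbers in proportion, the figures quoted above can be divided out per chip. The following is a rough back-of-envelope sketch, not an official specification: it assumes the "over 134k" and "over 1.7K ExaFlops" values are round lower bounds, that bandwidth and compute are spread evenly across chips, and that all function and constant names are illustrative. The brief does not state the numeric precision behind the ExaFlops figure, so the per-chip compute value should be read with that caveat.

```python
import math

# Figures quoted in the brief; "over" values are treated as round lower bounds.
CHIPS_PER_FABRIC = 134_000        # "over 134k TPU 8t chips"
FABRIC_BANDWIDTH_BPS = 47e15      # 47 Pb/s non-blocking bidirectional bandwidth
FABRIC_COMPUTE_FLOPS = 1.7e21     # "over 1.7K ExaFlops" (precision not stated)
CHIPS_PER_SUPERPOD = 9_600        # single superpod size


def per_chip_bandwidth_gbps(total_bps: float, chips: int) -> float:
    """Average share of fabric bandwidth per chip, in Gb/s (even split assumed)."""
    return total_bps / chips / 1e9


def per_chip_pflops(total_flops: float, chips: int) -> float:
    """Average compute per chip, in PFLOPS (even split assumed)."""
    return total_flops / chips / 1e15


def superpods_needed(chips: int, chips_per_superpod: int) -> int:
    """Minimum number of full-size superpods required to hold `chips` chips."""
    return math.ceil(chips / chips_per_superpod)


if __name__ == "__main__":
    bw = per_chip_bandwidth_gbps(FABRIC_BANDWIDTH_BPS, CHIPS_PER_FABRIC)
    pf = per_chip_pflops(FABRIC_COMPUTE_FLOPS, CHIPS_PER_FABRIC)
    pods = superpods_needed(CHIPS_PER_FABRIC, CHIPS_PER_SUPERPOD)
    print(f"Per-chip fabric bandwidth: about {bw:.0f} Gb/s")
    print(f"Per-chip compute: about {pf:.1f} PFLOPS")
    print(f"Superpods per fabric: at least {pods}")
```

Under these assumptions, one fabric works out to roughly 351 Gb/s of data center network bandwidth and about 12.7 PFLOPS per chip, spread over at least 14 full-size superpods.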
Comment
L2LunchBoy
· 12h ago
Can FP4 precision training be stable, or is it only for inference?
NeonIceMelt
· 12h ago
With 134k chips in one fabric, segmenting the fault domains takes real expertise.
LatencyLullaby
· 13h ago
SparseCore and FP4 are natively supported; Google is really pushing inference costs down to the limit.
GateUser-ebdc7d3a
· 14h ago
A single super pod with a 9,600-chip capacity—how is cooling handled at this density? I’m really curious.
ByteBard
· 14h ago
The Arm-based Axion CPU is now integrated; heterogeneous computing keeps getting better.