Google Cloud A4X Max Bare Metal Instances Support 50k GPU Clusters, Network Bandwidth Doubles

ME News Updates, April 19 (UTC+8): Google Cloud announced that its A4X Max bare-metal instances support clusters of up to 50,000 GPUs, with twice the network bandwidth of the previous generation. The instance belongs to the Google Compute Engine accelerator-optimized machine series, which come with NVIDIA GPUs pre-installed and are designed for AI, machine learning, high-performance computing, and graphics-intensive applications. The documentation covers multiple machine series, including A4X Max, A4X, A4, A3, A2, G4, and G2, and recommends specific series by workload type: pre-training, fine-tuning, inference, graphics, and high-performance computing. It also explains pricing, which is based on the pre-installed GPUs, vCPUs, memory, and local SSDs, the available consumption options (on-demand, Spot, Flex-start, and reserved), and the maintenance experience for each machine type. (Source: InfoQ)
MossyLedger
· 27m ago
Pretraining, fine-tuning, inference: all included. Google seems to want people on its cloud from cradle to grave.
GateUser-8da82d63
· 6h ago
Instance types are recommended by workload, so I don't have to calculate TFLOPS myself.
BoredInBlockspace
· 6h ago
Reserved instances suit businesses with highly predictable demand, but who can really say for sure that their large-model training needs will be predictable?
GateUser-eccf92a1
· 6h ago
Google Cloud's GPU supply has finally caught up; previously, applying for A100s took forever.
BlocktimeBarista
· 6h ago
Bare metal means no virtualization overhead; this money is well spent.
ExitLiquidityIntern
· 6h ago
Maintaining a cluster of 50k cards... How many SREs are needed to get a good night's sleep?
TreatMemesAsBeliefs
· 6h ago
Transparent local SSD pricing is a good thing; previously, some cloud providers' storage costs were like opening a blind box.
StakingDaydreamer
· 6h ago
Cloud providers have gone wild. A 50,000-GPU cluster: are they trying to snatch the supercomputing centers' bread and butter?