## NVIDIA Rubin Platform Official Announcement: Inference Cost Reduced by 10x, GPU Count Reduced by 4x, Delivery in H2 2026
NVIDIA officially announced the next-generation flagship AI chip product, the Rubin platform, at CES 2025, continuing its tradition of annual iterative updates. According to CEO Jensen Huang's on-site reveal, the six core chips of Rubin have returned from the foundry and completed critical testing and validation, confirming they can be deployed as planned. This signifies that NVIDIA maintains its technological leadership in the AI accelerator field and also addresses Wall Street's concerns about competitive pressure and the sustainability of AI investments.
### Performance Soars, Costs Significantly Drop
Regarding Rubin's price, NVIDIA has not yet disclosed specific quotes, but its cost-effectiveness has improved markedly. Compared to the previous Blackwell platform, Rubin's training performance has increased by 3.5 times and its inference performance by 5 times. More notably, Rubin can cut the cost of token generation during inference by 10x, which translates into a substantial operational cost reduction for enterprises that rely on large-model inference.
Additionally, Rubin reduces the GPU count needed to train mixture-of-experts (MoE) models by 4x. This allows companies to achieve the same performance goals with less hardware investment, directly improving procurement ROI.
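To make the two headline claims concrete, here is a minimal back-of-envelope sketch. The baseline figures (the $2.00 per million tokens and the 1,024-GPU cluster) are hypothetical placeholders for illustration, not numbers from NVIDIA:

```python
# Illustrative arithmetic for the article's two cost claims.
# Baseline inputs below are hypothetical, not NVIDIA data.

def inference_cost_per_million_tokens(baseline_cost: float, reduction: float = 10.0) -> float:
    """Cost per million tokens after the claimed 10x reduction."""
    return baseline_cost / reduction

def gpus_needed_for_moe(baseline_gpus: int, reduction: int = 4) -> int:
    """GPU count after the claimed 4x reduction for MoE training."""
    return baseline_gpus // reduction

# Hypothetical baseline: $2.00 per million tokens on prior-generation hardware
print(inference_cost_per_million_tokens(2.00))  # -> 0.2

# Hypothetical baseline: a 1,024-GPU MoE training cluster
print(gpus_needed_for_moe(1024))  # -> 256
```

The point of the sketch is that both claims compound: a fleet that is 4x smaller and 10x cheaper per token shifts the economics of both training and serving at once.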
### Six Pillars of Technological Innovation
The Rubin platform integrates six breakthrough technologies. Among them, the all-new Vera CPU features 88 custom Olympus cores based on the Armv9.2 architecture, with single-core performance twice that of competing solutions. This CPU is specially optimized for AI agent inference and is positioned as the most energy-efficient processor choice for large-scale AI factories.
On the GPU side, it is equipped with the third-generation Transformer engine, providing 50 petaflops of NVFP4 compute capability. Single-GPU bandwidth reaches 3.6 TB/s, while a full Vera Rubin NVL72 cabinet achieves 260 TB/s. Such bandwidth levels provide ample data throughput for large-scale model training and inference.
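The cabinet-level figure follows from the per-GPU figure: an NVL72 rack houses 72 GPUs, so the aggregate is roughly 72 times the single-GPU bandwidth. A quick sanity check:

```python
# Sanity check on the quoted bandwidth figures:
# an NVL72 cabinet houses 72 GPUs at 3.6 TB/s each.
per_gpu_tb_s = 3.6
gpus_per_cabinet = 72

aggregate_tb_s = per_gpu_tb_s * gpus_per_cabinet
print(round(aggregate_tb_s, 1))  # -> 259.2, consistent with the quoted ~260 TB/s
```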
The platform also incorporates a third-generation confidential computing engine and a second-generation RAS (reliability, availability, and serviceability) engine spanning the full CPU, GPU, and NVLink stack, offering real-time health monitoring, fault tolerance, and proactive maintenance. The cabinet itself adopts a modular design, with assembly and maintenance speeds 18 times faster than Blackwell.
### New Choice for Cloud Service Providers and AI Labs
NVIDIA announced that multiple enterprises, including major cloud infrastructure providers, will deploy Rubin instances in the second half of 2026. These cloud providers and integrators will offer Rubin computing power as a leased service to enterprise customers.
In the AI model development arena, well-known labs including OpenAI, Anthropic, Meta, Mistral AI, and xAI have stated they will use the Rubin platform to train larger and more powerful next-generation models. OpenAI CEO Sam Altman said that increased computing power directly drives the evolution of intelligent agents, and Rubin’s performance advantages will continue to propel this process. Anthropic CEO Dario Amodei pointed out that Rubin’s enhanced capabilities bring significant improvements in inference quality and model reliability. Meta CEO Mark Zuckerberg emphasized that the efficiency gains of the Rubin platform are crucial for deploying state-of-the-art AI models to billions of users worldwide.
### The Industry Chain Lines Up
Server hardware vendors such as Cisco, Dell, HPE, Lenovo, and Supermicro have already planned Rubin-related server product lines. This indicates that Rubin is not just a GPU innovation but a catalyst for upgrading the entire AI infrastructure ecosystem.
NVIDIA's decision to release Rubin details earlier than usual this year reflects a strategy of sustaining industry momentum and market enthusiasm. The company typically reserves in-depth product introductions for its spring GTC conference in California, so this early reveal at CES underscores how much AI competition has intensified.
### Long-term Outlook
Although investors remain skeptical about NVIDIA's sustained growth and the longevity of AI spending, NVIDIA continues to maintain its long-term growth expectations and projects that the global AI-related market will reach trillions of dollars. The launch of the Rubin platform marks NVIDIA's continued leadership in AI chip iteration and signals that Rubin's cost-effectiveness will reshape corporate investment decisions in AI infrastructure.