The previous generation, Blackwell, was already impressive, right? NVIDIA is expected to announce that its next-generation chip, Vera Rubin, is entering mass production.

What makes Vera Rubin powerful? In a word: cheap.

To run the same AI model, you need only a quarter as many chips, and inference compute costs drop by 90%. A 90% drop, folks. AWS, Microsoft, and Google, the three major cloud providers, are all in the first batch.

Last year's $20 billion acquisition of Groq delivers results today

Jensen Huang said on an earlier earnings call that Groq would be integrated into NVIDIA's system as an expansion architecture, just as the Mellanox acquisition filled in NVIDIA's networking capabilities.

Groq's LPU sits in the same data center as NVIDIA's GPU. The GPU understands the problem, and the LPU is responsible for rapidly outputting answers.

The two chips split the work between them, which directly reduces latency in agent scenarios.

AI agents work on behalf of humans. A single task may call the model dozens of times, and each round burns inference compute. Meanwhile the user is waiting: if responses are even slightly slow, the experience falls apart.

Inference has two steps: first understanding your question (commonly called prefill), then outputting the answer token by token (commonly called decode).

The GPU excels at the first step, but for the speed and stability of the second step (outputting words), Groq's LPU is stronger.
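To see why the second step matters so much for agents, here is a rough back-of-the-envelope sketch. Every number in it is a made-up placeholder, not a measured GPU or LPU figure, and the `Hardware` class and function names are purely illustrative. It only shows how per-token output speed gets multiplied across the dozens of calls in one agent task.

```python
from dataclasses import dataclass


@dataclass
class Hardware:
    """Hypothetical timing profile for one model call, in milliseconds."""
    prefill_ms: int      # step 1: "understanding" the prompt
    per_token_ms: int    # step 2: emitting each output token


def call_latency_ms(hw: Hardware, output_tokens: int) -> int:
    # One call = one prefill pass plus one decode step per output token.
    return hw.prefill_ms + hw.per_token_ms * output_tokens


def task_latency_ms(hw: Hardware, calls: int, output_tokens: int) -> int:
    # An agent task chains many calls back to back (assumed sequential here).
    return calls * call_latency_ms(hw, output_tokens)


# Placeholder profiles: same prefill speed, faster decode in the split setup.
gpu_only = Hardware(prefill_ms=200, per_token_ms=20)
gpu_plus_lpu = Hardware(prefill_ms=200, per_token_ms=5)

CALLS_PER_TASK = 30      # "dozens of times"; assumed value
TOKENS_PER_CALL = 200    # assumed value

baseline_ms = task_latency_ms(gpu_only, CALLS_PER_TASK, TOKENS_PER_CALL)
split_ms = task_latency_ms(gpu_plus_lpu, CALLS_PER_TASK, TOKENS_PER_CALL)

print(f"GPU only:  {baseline_ms / 1000:.0f} s per task")
print(f"GPU + LPU: {split_ms / 1000:.0f} s per task")
```

With these placeholder values the decode step dominates each call, so speeding up only the second step shortens the whole task by a wide margin. Real ratios will depend on the model, prompt length, and hardware.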

Was $20 billion expensive?

Think about it: in the future, every company will run hundreds of agents, and each agent calls the model thousands of times per day.
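A quick sketch of that arithmetic, taking the low end of "hundreds" and "thousands" and a purely hypothetical price per call (none of these values come from NVIDIA or Groq). It also applies the 90% cost reduction quoted earlier in the post.

```python
def daily_calls(agents: int, calls_per_agent: int) -> int:
    # Total model calls one company makes per day.
    return agents * calls_per_agent


def daily_cost_cents(calls: int, cents_per_call: int) -> int:
    # Integer cents keep the arithmetic exact.
    return calls * cents_per_call


AGENTS = 100               # low end of "hundreds of agents"
CALLS_PER_AGENT = 1_000    # low end of "thousands of times per day"
CENTS_PER_CALL = 10        # hypothetical price, for illustration only
COST_DROP_PCT = 90         # the reduction claimed in the post

calls = daily_calls(AGENTS, CALLS_PER_AGENT)
before = daily_cost_cents(calls, CENTS_PER_CALL)
after = before * (100 - COST_DROP_PCT) // 100

print(f"{calls:,} calls per day per company")
print(f"before: ${before / 100:,.0f}/day, after: ${after / 100:,.0f}/day")
```

Even at the low end, that is one hundred thousand calls per company per day, which is why the per-call cost and speed of inference are the numbers to watch.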

NVIDIA launches its own version of OpenClaw, called NemoClaw

It's an open-source platform. Enterprises can install it to deploy AI employees that run processes, handle data, and manage projects on behalf of real people. Discussions with Salesforce and Adobe are reportedly already underway.

The interesting part is that NemoClaw does not require you to use NVIDIA chips. Think about the logic: selling chips only earns you the profit at the hardware layer, while setting the rules earns you profit across the entire chain. Jensen Huang has clearly done the math.

Jensen Huang said he would show a "chip the world has never seen"

It is likely to be the first appearance of Feynman, the architecture two generations out, which is set to enter mass production in 2028 on TSMC's most advanced 1.6nm process.

Also, there's a lesser-known piece of news I find quite interesting.

NVIDIA is releasing two laptop processors aimed at gaming. The graphics card maker is coming to steal the CPU's job.

Wanwan feels that Jensen Huang is destined to become a great man of his era.

The most important thing today is the NVIDIA GTC conference, which is like an AI version of Sapiens.
