Blackwell, the previous generation, was already pretty impressive, right? Now the new-generation chip, Vera Rubin, is about to be announced as entering mass production.

What’s so powerful about Vera Rubin? To put it plainly: it’s cheap.

To run the same AI model, you need only a quarter as many chips, and the inference compute cost drops by 90%. By 90%, friends. AWS, Microsoft, and Google, the three major cloud providers, will be the first to get on board.
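To make that arithmetic concrete, here is a minimal sketch. The baseline chip count and cost are made-up placeholders, not NVIDIA figures; only the two ratios (a quarter of the chips, 90% lower cost) come from the claim above.

```python
# Illustrative arithmetic only. The baseline numbers passed in are hypothetical;
# only the two ratios below are taken from the post.

CHIP_FRACTION = 0.25       # Vera Rubin needs a quarter as many chips
COST_REDUCTION_PCT = 90    # inference compute cost drops by 90%


def rubin_estimate(baseline_chips: int, baseline_cost: float) -> dict:
    """Scale a hypothetical Blackwell baseline by the claimed ratios."""
    remaining = (100 - COST_REDUCTION_PCT) / 100  # share of the old cost still paid
    return {
        "chips": baseline_chips * CHIP_FRACTION,
        "cost": baseline_cost * remaining,
        "inference_per_dollar_multiplier": 1 / remaining,
    }


if __name__ == "__main__":
    # Placeholder baseline: 1000 chips costing 100 units to run a workload.
    print(rubin_estimate(baseline_chips=1000, baseline_cost=100.0))
```

Put another way, a 90% cost drop means the same budget buys roughly ten times as much inference, assuming the claimed ratio holds.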

2）Groq, acquired last year for $20 billion, hands in its homework today

Earlier, Jensen Huang said on an earnings call that Groq would be plugged into NVIDIA’s ecosystem as an extension architecture, much as NVIDIA once acquired Mellanox to shore up its networking capabilities.

Groq’s LPU and NVIDIA’s GPU sit in the same data center: the GPU understands the question, and the LPU quickly spits out the answer.

With the two chip types splitting the work and coordinating in agent scenarios, latency drops sharply.

AI agents do work on people’s behalf. For a single task, an agent might call the model dozens of times back and forth. Each round burns inference compute while the user just sits there waiting, and if it is slow, the experience collapses.

Inference happens in two steps: first, understand your question; then, output the answer word by word.

GPUs are good at the first step, but for the second step—the speed and stability of outputting words—Groq’s LPU is stronger.
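The two-step split can be put into a rough latency model: one model call costs the time to read the prompt (commonly called prefill) plus the time to emit each output token (decode), and an agent task repeats that call many times. A minimal sketch, in which every number is a hypothetical placeholder rather than a measured GPU or LPU figure:

```python
from dataclasses import dataclass


@dataclass
class CallProfile:
    """Hypothetical timings for one model call; none of these are measured values."""
    prefill_seconds: float           # step 1: read and understand the prompt
    seconds_per_output_token: float  # step 2: emit the answer token by token
    output_tokens: int


def call_latency(p: CallProfile) -> float:
    """Latency of a single call: prefill plus token-by-token decode."""
    return p.prefill_seconds + p.seconds_per_output_token * p.output_tokens


def task_latency(p: CallProfile, calls_per_task: int) -> float:
    """An agent task makes many sequential calls, so per-call latency adds up."""
    return call_latency(p) * calls_per_task


if __name__ == "__main__":
    # Same prefill time, different decode speed (placeholder values).
    slower_decode = CallProfile(0.5, 0.02, 200)
    faster_decode = CallProfile(0.5, 0.005, 200)
    for name, profile in [("slower decode", slower_decode), ("faster decode", faster_decode)]:
        print(name, round(task_latency(profile, calls_per_task=30), 1), "seconds")
```

With these placeholder numbers the decode step dominates each call, which is why faster token output matters more and more as the number of calls per task grows.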

Is $20 billion expensive?

Just think about it: in the future, every company will run hundreds of agents, and each agent will call the model thousands of times every day.

3）NVIDIA launches its own version of OpenClaw, called NemoClaw

It’s an open-source platform. Once enterprises install it, they can deploy AI employees that run workflows, process data, and manage projects on behalf of humans. NVIDIA is said to already be in talks with Salesforce and Adobe.

What’s interesting is that NemoClaw doesn’t require you to use NVIDIA chips. Think about that logic. Selling chips only makes money on the hardware layer—setting the rules is what lets you earn across the entire chain. Jensen Huang has this figured out extremely well.

4）Jensen Huang says he’s going to showcase a “chip the world has never seen”

Most likely this is the first appearance of Feynman, the architecture two generations out, with mass production in 2028 on TSMC’s most advanced 1.6nm process.

There’s also a lesser-known piece of news that I find quite interesting.

NVIDIA has come out with two laptop processors geared toward gaming. The graphics card seller is about to grab a share of the CPU business as well.

Wang Wan’s feeling is that Jensen Huang will go down as one of the great figures of this era.


Today’s most important event is the NVIDIA GTC conference, which is basically an AI version of A Short History of Humanity.
