Today’s most important event is the NVIDIA GTC conference, which is basically an AI version of Sapiens: A Brief History of Humankind.


Before Jensen Huang even takes the stage, the information leaked in advance is already enough to fill a book.

Wanwan has rounded up four big takeaways. Come along with me, friends.

1) AI compute costs cut to one-tenth

The previous Blackwell was already pretty strong, right?
Next up, they’re about to announce mass production of the new-generation chip Vera Rubin.

What’s so powerful about Vera Rubin? Put simply: it’s cheap.

Run the same AI model:
the number of chips drops to one-quarter, and inference compute costs fall by 90%.
Fall by 90%, friends.
AWS, Microsoft, and Google—the three major cloud providers—will be the first to roll in.

2) The Groq they bought for $20 billion last year is handing in its homework today

Back then, Jensen Huang said on an earnings call that Groq would be integrated into NVIDIA’s ecosystem as an expansion architecture, just like how they acquired Mellanox back in the day to round out their networking capabilities.

Groq’s LPU and NVIDIA’s GPU sit in the same data center: the GPU understands the problem, and the LPU quickly spits out the answer.

With the two kinds of chips working in tandem, the latency in Agent scenarios drops dramatically.

An AI Agent does work on a person’s behalf. A single task might bounce back and forth, calling the model dozens of times; each round burns inference compute while the user sits there waiting. If it’s even a bit slow, the experience collapses.

Inference happens in two steps: first it understands your question, then it outputs the answer one character at a time.

GPUs are good at the first step, but for the second step—the speed and stability of “typing out” the words—Groq’s LPU is stronger.
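The two-step split can be sketched as a toy Python loop. To be clear, this is a hand-wavy illustration, not any real GPU or LPU API; the function names and placeholder tokens are made up for the example:

```python
# Toy sketch of the two phases of LLM inference.
# "prefill" reads the whole prompt in one parallel pass (where GPUs shine);
# "decode" emits the answer one token at a time (the sequential phase an
# LPU-style chip targets).

def prefill(prompt_tokens):
    """Process the entire prompt at once; a real model would build a KV cache
    here, but this toy just keeps the context as a list."""
    return {"context": list(prompt_tokens)}

def decode(state, max_new_tokens):
    """Generate tokens one by one, each step depending on the previous one."""
    output = []
    for step in range(max_new_tokens):
        # A real model samples the next token from context + past output;
        # this toy just appends a numbered placeholder token.
        next_token = f"tok{step}"
        output.append(next_token)
        state["context"].append(next_token)
    return output

state = prefill(["what", "is", "an", "LPU", "?"])
answer = decode(state, max_new_tokens=3)
print(answer)  # ['tok0', 'tok1', 'tok2']
```

The point of the sketch: prefill is one big batch of work, while decode is a loop that cannot be parallelized across steps, so per-token speed and stability in that loop is a separate engineering problem.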

Is $20 billion expensive?

Think about it: in the future, every company runs hundreds of Agents, and each Agent tunes models thousands of times every day.

3) NVIDIA’s answer to OpenClaw launches, and it’s called NemoClaw

It’s a fully open-source platform. Install it in an enterprise, and you can deploy AI employees that run workflows, handle data, and manage projects in place of real people.

Apparently it’s already in talks with Salesforce and Adobe.

The interesting part is that NemoClaw doesn’t require you to use NVIDIA chips.
Think that logic through.

Selling chips only makes money at the hardware layer; setting the rules lets you make money across the whole chain. Jensen Huang knows exactly how that math works.

4) Jensen Huang says he’ll show “a chip the world has never seen before”

Most likely, the next-next-generation architecture, Feynman, will make its first appearance—mass production in 2028, using TSMC’s most advanced 1.6nm process.

Also, there’s another less-noticed piece of intel I think is pretty interesting.

NVIDIA is making laptop processors—two models, focused on gaming.
The graphics-card folks are coming for the CPU makers’ lunch, huh.

Wanwan’s take: Jensen Huang is going to go down as one of the defining figures of this era.
