Today’s most important thing is the NVIDIA GTC conference; it’s basically an AI version of *Sapiens: A Brief History of Humankind*.

Jensen Huang hasn’t even taken the stage yet, but the information leaked in advance is already enough to fill a book.

Wanwán has rounded up four big takeaways. Come on, friends, follow me.

1) AI compute costs hit a 10-fold discount

The previous generation Blackwell was already extremely powerful, right? The next-generation chip Vera Rubin is about to be announced for mass production.

What’s so powerful about Vera Rubin? To put it plainly, one word: cheap.

Running the same AI model, the chip count is cut to a quarter and inference costs drop by 90%. Ninety percent, friends. AWS, Microsoft, and Google, the three major cloud providers, are the first to get on board.
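Taking the claimed numbers at face value, the arithmetic is simple enough to sketch. This is a toy calculation: the absolute figures are invented, and only the ratios come from the leak.

```python
# Illustrative math only: the absolute numbers here are made up; the
# "1/4 the chips" and "90% cheaper inference" ratios are the claims above.
blackwell_chips = 16                 # hypothetical cluster size
blackwell_cost_per_token = 1.00      # normalized inference cost

rubin_chips = blackwell_chips // 4                       # chip count cut to a quarter
rubin_cost_per_token = blackwell_cost_per_token * 0.10   # keep only 10% of the cost

print(rubin_chips)            # 4
print(rubin_cost_per_token)   # 0.1, i.e. the "10-fold discount" in the headline
```

One tenth of the cost per token is exactly where the "10-fold discount" framing comes from.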

2) Groq, acquired last year for $20 billion, is turning in its homework today

Earlier, on an earnings call, Jensen Huang said that Groq would plug into NVIDIA’s ecosystem as an extension of its architecture, much like the Mellanox acquisition once rounded out its networking capabilities.

Groq’s LPU and NVIDIA’s GPU are housed in the same data center. GPUs understand the problem, while the LPU rapidly spits out the answers.

With the two types of chips dividing the work and working together, latency in Agent scenarios drops immediately.

AI agents do the work for humans. One task can go back and forth through dozens of rounds of model calls. Every round burns inference compute while the user just sits there waiting. If it’s even a bit slow, the experience collapses.

Inference is done in two steps: first, understand your question; then output the answer word by word.

GPUs are good at the first step, but for the speed and stability of “spitting out words” in the second step, Groq’s LPU is stronger.
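As a rough sketch of why that second step matters, here is a toy latency model of the two phases. All throughput numbers are invented for illustration; real figures depend on the hardware and the model.

```python
# Toy model of the two inference phases described above. All timings
# are hypothetical; only the two-phase structure comes from the text.
def inference_latency(prompt_tokens, output_tokens,
                      prefill_tok_per_s, decode_tok_per_s):
    """Return (prefill_seconds, decode_seconds) for one model call."""
    prefill = prompt_tokens / prefill_tok_per_s   # step 1: understand the question
    decode = output_tokens / decode_tok_per_s     # step 2: emit the answer token by token
    return prefill, decode

# GPU-style: fast parallel prefill, slower sequential decode (invented figures)
gpu = inference_latency(1000, 500, prefill_tok_per_s=10000, decode_tok_per_s=50)
# LPU-style: same prefill, much faster token emission (invented figures)
lpu = inference_latency(1000, 500, prefill_tok_per_s=10000, decode_tok_per_s=500)

print(round(sum(gpu), 2))  # 10.1 seconds per call, dominated by decode
print(round(sum(lpu), 2))  # 1.1 seconds per call
```

Multiply the per-call latency by the dozens of rounds an agent task needs, and the gap between the two decode speeds compounds quickly.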

Is $20 billion expensive?

Think about it: later, every company will run hundreds of agents, and each agent will call models thousands of times a day.

3) NVIDIA’s answer to OpenClaw is going live, and it’s called NemoClaw

It’s an open-source platform. Once enterprises install it, they can deploy AI employees that run workflows, process data, and manage projects for real people. Word is it’s already in talks with Salesforce and Adobe.

The interesting part is that NemoClaw doesn’t require you to use NVIDIA chips. Think about that logic: selling chips makes money only at the hardware layer, while setting the rules lets you capture profit across the entire ecosystem. Jensen Huang clearly has this figured out.

4) Jensen Huang says he will showcase “a chip the world has never seen before”

Most likely, it’s the first appearance of the next-next-generation Feynman architecture, with mass production in 2028 on TSMC’s most advanced 1.6nm process.

Also, there’s one lesser-known piece of intel I think is pretty interesting.

NVIDIA has also made laptop processors, two models, aimed mainly at gaming. The graphics-card seller is coming for the CPU business.

Wanwán’s feeling is that Jensen Huang is going to go down as one of the great figures of this era.
