Today’s most important event is the NVIDIA GTC conference, which is basically an AI version of A Brief History of Humankind.

Jensen Huang hasn’t even stepped on stage yet, but the amount of leaked information is already enough to fill a whole book.

Wang Wan has pulled together three big highlights—come on, friends, let’s go.

  1. AI compute costs cut to one-tenth

The previous generation, Blackwell, was already pretty powerful, right? Its successor, Vera Rubin, is about to be announced for mass production.

What makes Vera Rubin so strong? Put simply, one word: cheap.

Running the same AI model, the number of chips is cut to one-quarter and inference compute costs drop by 90%.
Down 90%, friends.
The three major cloud providers, AWS, Microsoft, and Google, are all on board in the first batch.
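
To make that arithmetic concrete, here is a quick sketch. Only the two ratios (one-quarter of the chips, 90% lower cost) come from the leaks above. The baseline of 1,000 chips and a $1,000,000 inference bill is a made-up number purely for illustration, and `rubin_estimate` is a hypothetical helper name.

```python
def rubin_estimate(baseline_chips: int, baseline_cost: float) -> tuple[float, float]:
    """Apply the two leaked ratios to a hypothetical baseline deployment."""
    chips = baseline_chips / 4                  # chip count cut to one-quarter
    cost = baseline_cost * (100 - 90) / 100     # inference cost down 90%
    return chips, cost


# Hypothetical baseline, NOT a reported figure.
chips, cost = rubin_estimate(baseline_chips=1000, baseline_cost=1_000_000.0)
print(f"chips needed: {chips:.0f}, inference cost: ${cost:,.0f}")
```

A 90% drop is the same thing as paying one-tenth, which is where the headline number comes from.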

  2. Groq, bought for $20 billion last year, turns in its homework today

Jensen Huang said on an earlier earnings call that Groq would be integrated into the NVIDIA ecosystem as an extension of the architecture, much as NVIDIA once acquired Mellanox to round out its networking capabilities.

Groq’s LPU and NVIDIA’s GPU sit in the same data center: the GPU understands the question, and the LPU quickly spits out the answer.

With the two kinds of chips working together, latency in agent scenarios drops sharply.

AI agents do the work for people. A single task might require dozens of rounds of back-and-forth model calls. Each round burns inference compute while the user sits there waiting. If it’s even a bit slower, the experience collapses.

Inference is done in two steps: first understand your question, then output the answer word by word.

GPUs are great at the first step, but for the second step, producing words quickly and steadily, Groq’s LPU is stronger.
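
A rough way to see why the second step matters so much for agents: engineers usually call the two steps prefill and decode. The sketch below is a toy latency model, not a benchmark. Every number in it (prefill time, per-token time, 30 rounds, 200 output tokens) is an assumption picked for illustration, and the `Backend` class is hypothetical, not a Groq or NVIDIA API.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    prefill_ms: float     # step one: time to read and understand the prompt
    ms_per_token: float   # step two: time to emit each output token


def round_latency_ms(backend: Backend, output_tokens: int) -> float:
    """Latency of one model call: prefill once, then decode token by token."""
    return backend.prefill_ms + output_tokens * backend.ms_per_token


def task_latency_ms(backend: Backend, rounds: int, output_tokens: int) -> float:
    """An agent task is many sequential model calls, so latencies add up."""
    return rounds * round_latency_ms(backend, output_tokens)


# Made-up numbers: same prefill speed, faster decode on the second setup.
gpu_only = Backend(prefill_ms=200.0, ms_per_token=20.0)
gpu_plus_lpu = Backend(prefill_ms=200.0, ms_per_token=5.0)

for name, backend in [("GPU only", gpu_only), ("GPU + LPU", gpu_plus_lpu)]:
    seconds = task_latency_ms(backend, rounds=30, output_tokens=200) / 1000
    print(f"{name}: {seconds:.0f} s per agent task")
```

In this toy setup the decode step dominates each round, so speeding it up shortens the whole task far more than a faster prefill would.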

Is $20 billion expensive?

Just think about it: in the future, every company will run hundreds of agents, and each agent will call models thousands of times per day.
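
Here is that back-of-the-envelope math as a sketch. The post only says "hundreds" and "thousands", so 300 agents and 2,000 calls per agent per day are placeholder values, not reported figures, and `daily_model_calls` is a hypothetical helper.

```python
def daily_model_calls(agents: int, calls_per_agent: int) -> int:
    """Total model calls per day for one company."""
    return agents * calls_per_agent


# Placeholder values standing in for "hundreds" and "thousands".
calls = daily_model_calls(agents=300, calls_per_agent=2_000)
print(f"{calls:,} model calls per day for one company")
```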

  3. NVIDIA’s version of OpenClaw is live, and it’s called NemoClaw

It’s an open-source platform: once an enterprise installs it, it can deploy AI employees that run workflows, process data, and manage projects in place of real people.
NVIDIA is reportedly already in talks with Salesforce and Adobe.

The interesting part is that NemoClaw doesn’t require you to use NVIDIA chips.
Think about the logic here.
Selling chips only earns money at the hardware layer; setting the rules earns money across the whole chain. Jensen Huang has done this math very carefully.

  4. Jensen Huang says he’ll showcase a “chip the world has never seen before”

Most likely this is Feynman, the architecture after next, making its first appearance, with mass production in 2028 on TSMC’s most advanced 1.6nm process.

Also, there’s another less-talked-about piece of information I think is pretty interesting.

NVIDIA has released two laptop processors focused on gaming.
The company that sells graphics cards is coming for the CPU business.

Wang Wan feels that Jensen Huang is going to become one of the great figures of this era.
