Today’s most important event is the NVIDIA GTC conference, which is essentially an AI version of “A Brief History of Mankind.”

Before Jensen Huang even takes the stage, the amount of leaked information is already enough to write a book.

Wang Wang has rounded up the biggest highlights; come along with me, friends.

  1. AI computing power costs are directly cut by 90%.

The previous-generation Blackwell was already impressive, right? Now the next-generation chip, Vera Rubin, is about to be announced for mass production.

What’s impressive about Vera Rubin? Simply put, one word: cheap.

For the same AI model, the chip count drops to a quarter and inference compute costs fall by 90%. Ninety percent, friends. The three major cloud providers, AWS, Microsoft, and Google, are first in line to adopt it.
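A quarter of the chips alone would only be a 75% cut, so the claimed 90% implies per-chip economics improve too. A back-of-envelope sketch with hypothetical, normalized numbers (none of these are official NVIDIA figures) shows how the two effects would have to compound:

```python
# Hypothetical back-of-envelope numbers -- not official NVIDIA figures.
blackwell_chips = 100          # chips needed to serve a given model today
blackwell_cost_per_chip = 1.0  # normalized cost per chip

rubin_chips = blackwell_chips / 4   # "chip count cut to a quarter"
rubin_cost_per_chip = 0.4           # assumed effective per-chip cost (illustrative)

old_cost = blackwell_chips * blackwell_cost_per_chip  # 100.0
new_cost = rubin_chips * rubin_cost_per_chip          # 25 * 0.4 = 10.0

reduction = 1 - new_cost / old_cost
print(f"inference cost reduction: {reduction:.0%}")   # 90%
```

The point of the sketch: hitting 90% requires the remaining chips to be meaningfully cheaper per unit of work, not just fewer.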

  2. Groq, which was acquired for $20 billion last year, delivers its results today.

Previously, Jensen Huang mentioned in the earnings call that Groq would be integrated into the NVIDIA ecosystem as an expansion architecture, just like how acquiring Mellanox filled the networking capability gap back then.

Groq’s LPU and NVIDIA’s GPU will be in the same data center; the GPU handles understanding the problem, while the LPU is responsible for quickly delivering the answer.

The division of labor between the two chips directly reduces latency in Agent scenarios.

AI Agents do work on humans’ behalf; a single task may require dozens of model calls, and every round burns inference compute while the user waits, so any delay ruins the experience.

Inference breaks into two steps: first understanding your question (prefill), then generating the answer token by token (decode).

GPUs excel at the first step, but for the speed and steadiness of the token-by-token output in the second step, Groq’s LPU is stronger.

Is $20 billion expensive?

Just do the math: in the future every company will run hundreds of Agents, and each Agent will call models thousands of times a day.

  3. NVIDIA’s version of OpenClaw is launched, called NemoClaw.

It’s an open-source platform that companies can install to deploy AI employees to run processes, handle data, and manage projects. It is said to be already in talks with Salesforce and Adobe.

What’s interesting is that NemoClaw does not require you to use NVIDIA’s chips. Think about this logic. Selling chips only earns money on the hardware level; setting the rules allows you to earn money across the entire chain. Jensen Huang has this figured out clearly.

  4. Jensen Huang said he would showcase “chips the world has never seen before.”

It is highly likely that the next-generation architecture Feynman will make its first appearance, set for mass production in 2028, using TSMC’s most advanced 1.6nm process.

Additionally, there’s a lesser-known tidbit that I find quite interesting.

NVIDIA has released two laptop processors aimed at gaming. The company that sells graphics cards is now coming for the CPU market.

Wang Wang gets the feeling that Jensen Huang is going to go down as one of the greats.
