【AI+NVDA】NVIDIA Reportedly to Launch AI Inference Chip at GTC Conference — How Does It Differ from Blackwell and Rubin Chips?


Nvidia (US: NVDA) is about to hold its annual GTC conference. According to foreign media reports, Nvidia CEO Jensen Huang is expected to announce a new inference-focused chip at the event, designed for running AI models rather than training them.

The report states that this will be Nvidia’s first new product since last December, when it struck a non-exclusive technology licensing agreement worth $20 billion with AI chip startup Groq and brought Groq’s founders and core team on board.

Groq is known for developing Language Processing Units (LPUs), which answer complex AI queries with low latency. Three months after the deal, Nvidia is expected to release an LPU based on Groq’s architecture, working in tandem with its upcoming flagship Vera Rubin GPU to counter competitors and support a new class of AI applications.

Sources say that over the past three years, Nvidia’s massive market value has largely been driven by its GPUs becoming the backbone of the generative AI industry, used to train models such as OpenAI’s ChatGPT. Huang has long held that a single system can be used both to train new AI models and to run the chatbots and coding tools built on them. Major tech companies have invested hundreds of billions of dollars deploying these systems, while also developing their own dedicated AI chips. But as AI tools such as AI agents grow more complex, Huang may need to abandon the idea that one GPU can handle any workload.

The new inference chip uses SRAM instead of HBM

Meanwhile, HBM is expensive and its supply increasingly tight, and memory suppliers such as SK Hynix and Micron may struggle to keep up with AI demand. Nvidia’s flagship Blackwell and Rubin systems rely on high-bandwidth memory to handle the large data loads of AI models.

Sources indicate that Nvidia’s Groq-like chip will use static random-access memory (SRAM) instead of the dynamic RAM (DRAM) that underlies HBM. SRAM is easier to source and better suited to accelerating AI inference tasks.
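Why memory choice matters so much for inference: token-by-token generation is typically memory-bandwidth bound, so the rate at which weights can be streamed to the compute units, not raw FLOPS, caps throughput. The sketch below is a back-of-envelope roofline estimate, not anything from the report; the bandwidth and model-size figures are illustrative assumptions chosen only to show the shape of the tradeoff.

```python
# Illustrative roofline estimate (assumed figures, not from the report):
# decode-phase LLM inference is usually memory-bandwidth bound, so an
# upper bound on tokens/sec per chip is memory_bandwidth / bytes_per_token.

def peak_tokens_per_sec(model_params_billion: float,
                        bytes_per_param: float,
                        mem_bandwidth_tb_s: float) -> float:
    """Upper-bound tokens/sec when each generated token requires streaming
    all model weights from memory once (batch size 1, no reuse)."""
    bytes_per_token = model_params_billion * 1e9 * bytes_per_param
    return mem_bandwidth_tb_s * 1e12 / bytes_per_token

# Hypothetical comparison: an HBM-based accelerator at ~8 TB/s versus
# aggregate on-chip SRAM at ~80 TB/s, serving a 70B-parameter model
# stored in 8-bit (1 byte) weights.
hbm_bound = peak_tokens_per_sec(70, 1.0, 8.0)
sram_bound = peak_tokens_per_sec(70, 1.0, 80.0)
print(f"HBM-bound:  ~{hbm_bound:.0f} tokens/s")
print(f"SRAM-bound: ~{sram_bound:.0f} tokens/s")
```

Under these assumed numbers, the SRAM design’s tenfold bandwidth advantage translates directly into a tenfold ceiling on single-stream token throughput, which is why SRAM-heavy architectures like Groq’s LPU target low-latency inference rather than training.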

Nvidia declined to comment on the reports.
