So there's some interesting Groq news making the rounds about NVIDIA's strategic move in the inference space. Turns out Jensen Huang just broke down the real thinking behind why they went after Groq in the first place.
Last December, NVIDIA dropped $20 billion to acquire Groq's inference chip business. Groq founder Jonathan Ross and his core team came over to NVIDIA, but here's the thing: Groq still operates independently. Then at GTC this past March, they showed off the Groq 3 LPU chip, built on Samsung's 4nm process. The performance numbers are pretty wild: 35x the inference throughput per megawatt on trillion-parameter models compared to NVIDIA's Blackwell NVL72.
But what really caught my attention is Huang's explanation of the market dynamics driving this. He's talking about how the inference market is splitting into distinct segments. For years, everyone focused on one thing: maximizing throughput. But that's changing. Token economics have shifted dramatically; different users now place different value on response speed, and they're willing to pay accordingly.
Huang put it pretty clearly: if you can give developers faster-responding tokens that make them more productive, they'll pay premium prices for that capability. This market segment has only recently emerged. It's essentially expanding the Pareto frontier, adding a low-latency, higher-price-per-token segment alongside the existing high-throughput solutions.
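To make the Pareto-frontier framing concrete, here's a minimal sketch. The offerings and their latency/price numbers are hypothetical, purely for illustration, not real NVIDIA or Groq pricing. Each deployment option is a (latency, price-per-token) point, and an option sits on the frontier if no other option beats it on both axes at once:

```python
from dataclasses import dataclass

@dataclass
class Offering:
    name: str
    latency_ms: float      # typical response latency
    price_per_mtok: float  # price per million tokens

# Hypothetical offerings, for illustration only.
offerings = [
    Offering("gpu_batch", latency_ms=400.0, price_per_mtok=0.50),       # throughput-optimized
    Offering("gpu_interactive", latency_ms=150.0, price_per_mtok=1.20),
    Offering("lpu_low_latency", latency_ms=20.0, price_per_mtok=4.00),  # fast, premium-priced
]

def dominates(a: Offering, b: Offering) -> bool:
    """a dominates b if it's no worse on both axes and strictly better on one."""
    no_worse = a.latency_ms <= b.latency_ms and a.price_per_mtok <= b.price_per_mtok
    strictly_better = a.latency_ms < b.latency_ms or a.price_per_mtok < b.price_per_mtok
    return no_worse and strictly_better

def pareto_frontier(options: list[Offering]) -> list[Offering]:
    """Keep only the options not dominated by any other option."""
    return [o for o in options
            if not any(dominates(other, o) for other in options if other is not o)]

for o in pareto_frontier(offerings):
    print(f"{o.name}: {o.latency_ms} ms, ${o.price_per_mtok}/Mtok")
```

All three hypothetical options land on the frontier: the low-latency LPU option doesn't replace the GPU options, it extends the menu toward the fast, expensive end. That's the "expanding the Pareto frontier" point in code form.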
That's where Groq's LPU architecture comes in. It's built for deterministic low latency, which is almost the opposite of what GPUs optimize for; GPUs crush it on throughput. So the Groq acquisition basically fills a gap in NVIDIA's product strategy. You can run the same model two different ways: squeeze maximum throughput out of GPUs, or get ultra-low latency on Groq's LPU. Different pricing models for different use cases.
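Here's a rough sketch of what serving the same model two ways could look like from the operator's side, assuming a hypothetical router that picks a backend from the client's latency budget. The backend names, latencies, and prices are all made up for illustration:

```python
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    typical_latency_ms: float
    price_per_mtok: float

# Two hypothetical backends serving the same model weights.
GPU_THROUGHPUT = Backend("gpu_throughput", typical_latency_ms=300.0, price_per_mtok=0.60)
LPU_LOW_LATENCY = Backend("lpu_low_latency", typical_latency_ms=25.0, price_per_mtok=3.50)

def route(latency_budget_ms: float) -> Backend:
    """Send latency-sensitive requests to the premium LPU tier,
    everything else to the cheaper throughput-optimized GPU tier."""
    if latency_budget_ms < GPU_THROUGHPUT.typical_latency_ms:
        return LPU_LOW_LATENCY
    return GPU_THROUGHPUT

# E.g., an interactive coding assistant vs. an overnight batch summarization job.
for budget in (50.0, 2000.0):
    b = route(budget)
    print(f"budget {budget:>6.0f} ms -> {b.name} (${b.price_per_mtok}/Mtok)")
```

Same weights, two price points: the interactive request pays the premium for the fast tier, the batch job takes the cheap one.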
The Groq news here really highlights how the AI inference market is maturing beyond raw compute. It's about understanding what different customers actually need and building the right tool for each segment. Pretty smart move if you ask me.