Two $20 billion deals: OpenAI and NVIDIA are fighting a "battle over inference"
Written by: xiaopi
In December 2025, Nvidia quietly spent $20 billion to acquire an AI chip company called Groq.
On April 17, 2026, OpenAI announced it would purchase over $20 billion worth of chips from another AI chip company, Cerebras. On the same day, Cerebras officially filed for an IPO on NASDAQ, aiming for a valuation of $35 billion.
Two transactions, nearly identical in size. One is an acquisition, the other a purchase. One comes from the world’s largest seller of AI chips, the other from the world’s largest buyer of AI compute.
These are not two separate events; they are two symmetrical moves in the same war. The battlefield is called: AI inference.
Most people haven’t noticed this war because it has no explosions—only financial reports and technical discussions circulating among Silicon Valley engineers. But its impact could be more profound than any AI launch in the past two years—because it is redistributing control over a market that is almost certain to become the largest tech market in history.
What is inference, and why is “training” no longer the keyword in 2026?
Before discussing the two $20 billion figures, it’s necessary to understand a background: the AI chip battlefield is undergoing a shift in focus.
Training and inference are the two stages of AI compute consumption. Training is building the model: feeding massive amounts of data into a neural network to teach it certain abilities, a process that usually happens once or periodically. Inference is using the model: every time a user asks ChatGPT a question and gets an answer, that exchange is one inference request.
In 2023, the majority of global AI compute expenditure was on training, with inference playing a supporting role.
But this ratio is rapidly reversing.
According to market research data from Deloitte and CES 2026, in 2025, inference already accounted for 50% of all AI compute expenditure; by 2026, this will jump to two-thirds. Lenovo CEO Yang Yuanqing stated more plainly at CES: the structure of AI spending will flip from “80% training + 20% inference” to “20% training + 80% inference.”
The logic is simple. Training is a one-time cost, inference is a continuous cost. GPT-4 was trained once, but it answers hundreds of millions of questions daily—each conversation is an inference request. After large-scale deployment, the cumulative inference consumption far exceeds training.
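A rough back-of-the-envelope sketch makes this crossover concrete. The numbers below are hypothetical placeholders, not figures from the deals discussed here; the only point is the shape of the curve: a training cost is paid once, while inference cost accumulates with every day of usage.

```python
# Back-of-the-envelope: when does cumulative inference spend overtake a one-time training cost?
# All numbers are hypothetical placeholders, chosen only to illustrate the crossover.

TRAINING_COST = 1_000_000_000        # one-time training cost, in dollars (assumed)
COST_PER_1K_REQUESTS = 10.0          # serving cost per thousand inference requests (assumed)
REQUESTS_PER_DAY = 300_000_000       # daily request volume (assumed)

daily_inference_cost = REQUESTS_PER_DAY / 1_000 * COST_PER_1K_REQUESTS

cumulative = 0.0
for day in range(1, 3 * 365 + 1):
    cumulative += daily_inference_cost
    if cumulative >= TRAINING_COST:
        print(f"Cumulative inference spend passes the training cost on day {day}")
        break

print(f"Daily inference cost: ${daily_inference_cost:,.0f}")
print(f"Inference spend over three years: ${daily_inference_cost * 3 * 365:,.0f}")
```

With these placeholder numbers the crossover comes in under a year; raise the request volume and it comes in months.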
What does this mean? It means the most profitable piece of the AI industry is shifting from “training chips” to “inference chips.” And these two types of chips require fundamentally different architectures.
Nvidia’s problem: chips designed for training are inherently not optimized for inference
Nvidia’s H100 and H200 are monsters built for training. Their core advantage is extremely high computational throughput: training requires massive matrix multiplications, and that kind of massively parallel computation is exactly what GPUs excel at.
But the bottleneck in inference is not computation; it’s memory bandwidth.
When a user asks a question, the chip has to “move” the entire model’s weights from memory to the compute units before it can generate an answer. That “move” is the true source of inference latency. Nvidia’s GPUs rely on external high-bandwidth memory (HBM), and the data transfer inevitably introduces delay. For ChatGPT, which fields hundreds of millions of requests a day, that latency, multiplied across every query, becomes a real performance bottleneck.
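A minimal latency model shows why bandwidth, not raw FLOPs, sets the ceiling: in autoregressive decoding, every generated token requires streaming roughly all of the model’s weights from memory to the compute units, so per-token time is bounded below by weight bytes divided by memory bandwidth. The sketch below uses assumed numbers (a hypothetical 20-billion-parameter model in 16-bit weights and a bandwidth in the ballpark of a current high-end HBM GPU), not measurements of any specific system.

```python
# Minimal sketch of the memory-bandwidth bound on autoregressive decoding.
# Assumption: generating each token requires reading roughly all model weights once.

PARAMS = 20e9                 # hypothetical model size: 20B parameters (assumed)
BYTES_PER_PARAM = 2           # 16-bit weights (assumed)
HBM_BANDWIDTH = 3.35e12       # bytes/sec, ballpark of a high-end HBM3 GPU (assumed)

weight_bytes = PARAMS * BYTES_PER_PARAM               # ~40 GB of weights to stream per token
per_token_seconds = weight_bytes / HBM_BANDWIDTH      # lower bound from memory traffic alone
tokens_per_second = 1 / per_token_seconds

print(f"Weights streamed per token: {weight_bytes / 1e9:.0f} GB")
print(f"Bandwidth-bound latency per token: {per_token_seconds * 1e3:.1f} ms")
print(f"Ceiling on single-stream decode speed: {tokens_per_second:.0f} tokens/sec")
```

No amount of extra compute raises that ceiling; only moving the weights closer to the cores, or moving fewer bytes per token, does.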
Internal engineers at OpenAI noticed this issue when optimizing Codex (a code generation tool), and found that no matter how they tuned parameters, response speed was limited by Nvidia GPU architecture constraints.
In other words, Nvidia’s disadvantage in inference isn’t about effort; it’s about architecture.
Cerebras’ WSE-3 chip takes a completely different approach. The chip is so large it requires wafer-scale packaging: 46,225 square millimeters of silicon, bigger than a human palm, integrating 900,000 AI cores and 44GB of ultra-fast SRAM on a single wafer. Memory sits directly beside the compute cores, shrinking the “move” distance from centimeters to micrometers. The result: inference speeds 15 to 20 times faster than Nvidia’s H100.
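Plugging a much higher effective bandwidth into the same toy formula shows the direction of the gain when weights sit in on-chip SRAM next to the cores rather than in external HBM. The bandwidth figures are assumptions for illustration only, and the raw ratio overstates real gains: capacity, interconnect, batching, and compute impose their own limits, which is why measured end-to-end speedups (like the 15 to 20 times cited above) are far smaller than the bandwidth gap.

```python
# Same toy model as above, evaluated under two assumed effective memory bandwidths.
# Illustration only: real systems are constrained by far more than raw bandwidth.

PARAMS = 20e9                 # hypothetical 20B-parameter model (assumed, ~40 GB in 16-bit)
BYTES_PER_PARAM = 2
weight_bytes = PARAMS * BYTES_PER_PARAM

bandwidths = {
    "external HBM (assumed ~3.35 TB/s)": 3.35e12,
    "on-chip SRAM (assumed ~100 TB/s effective)": 100e12,
}

for label, bw in bandwidths.items():
    per_token_ms = weight_bytes / bw * 1e3
    print(f"{label}: ~{per_token_ms:.2f} ms per token (bandwidth bound)")
```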
It’s worth noting: Nvidia isn’t sitting still. Its latest Blackwell (B200) architecture has improved inference performance fourfold over H100 and is being deployed at scale. But Blackwell is chasing a moving target—Cerebras is also iterating, and the entire chip market is seeing emerging competitors beyond Cerebras.
Nvidia’s $20 billion: a historic acquisition that reads like a letter of admission
On December 24, 2025, Nvidia announced its largest acquisition ever.
Target: Groq.
Groq is a competitor similar to Cerebras, also specializing in SRAM-architecture chips optimized for inference—its chip, called LPU (Language Processing Unit), was rated as the fastest inference chip in public evaluations at the time. Nvidia spent $20 billion to acquire Groq’s core technology and founding team, including founder Jonathan Ross and top chip engineers from Google’s TPU team.
It was Nvidia’s largest acquisition ever, nearly three times the size of its roughly $7 billion purchase of Mellanox in 2019.
Many analysts see the message behind this huge sum as more important than the amount itself: Nvidia believes it has a structural gap in inference, and that gap is so significant it’s worth spending $20 billion to fill.
If Nvidia truly believed its GPUs were unbeatable in inference, it wouldn’t need to acquire Groq. In essence, this is a $20 billion technology purchase: an admission that on-chip SRAM architectures hold real advantages in inference, advantages Nvidia’s current product line cannot naturally cover. Nvidia is paying top dollar for a technical gap it cannot close on its own.
Of course, Nvidia’s official framing of the acquisition is different: “deep integration with Groq to provide a more complete inference solution.” The plain-language translation: “we realized our current offerings weren’t enough, so we bought someone else’s technology.”
OpenAI’s $20 billion: the chip purchase is the surface; the equity stake is the real point
Now, back to OpenAI.
In January 2026, OpenAI signed a $10 billion, three-year compute supply agreement with Cerebras. At the time it was reported as routine supplier diversification on OpenAI’s part and drew little attention.
But the latest details revealed on April 17 fundamentally change the nature of this deal:
First, the purchase amount doubled from $10 billion to $20 billion.
Second, OpenAI will receive warrants for Cerebras’ shares, which could give it up to 10% ownership as the scale increases.
Third, OpenAI will also provide $1 billion in data center construction funds to Cerebras—meaning OpenAI is helping Cerebras build factories.
Taken together, these three details paint a very different picture: OpenAI isn’t just buying chips; it’s incubating a supplier.
This logic has clear precedents in tech history. Starting around 2006, Apple began working with Samsung on custom chips for the iPhone, initially as a large procurement relationship. But as Apple’s involvement deepened and it eventually designed its own A-series and later M-series silicon, control of the supply chain shifted from suppliers like Samsung and Intel to Apple itself. What OpenAI is doing is somewhat similar, with one important difference: Apple ended up owning its chip designs outright, whereas OpenAI remains a purchaser. After Cerebras goes public, it will develop independently and serve more clients. The endgame may not be OpenAI fully controlling Cerebras, but rather a deeply interconnected ecosystem.
On one hand, $20 billion in orders plus an equity stake bind Cerebras to OpenAI and secure a continuous supply of inference compute outside Nvidia; on the other, OpenAI is working with Broadcom on its own ASIC chips, expected to reach mass production by late 2026. Running both tracks in parallel, the goal is compute independence.
What does Cerebras’ IPO mean?
On April 17, Cerebras officially filed for a NASDAQ IPO, targeting a valuation of $35 billion, planning to raise $3 billion.
This valuation is more than four times its $8.1 billion valuation in September 2025. In February, it completed a new funding round at a valuation of $23 billion, and the IPO target of $35 billion is a 52% premium over that.
Those familiar with Cerebras’ history know this is its second attempt at going public. The first, in 2024, was withdrawn because its key customer G42 (the Abu Dhabi-based AI and technology group) accounted for 83%–97% of revenue that year, and CFIUS intervened citing national security concerns.
This time, G42 has disappeared from the shareholder list, replaced by OpenAI.
In other words, Cerebras’ customer-concentration problem has not been fundamentally solved; the big customer has simply changed, and the dependence remains. Investors have to judge whether the new customer is better or worse. On creditworthiness, OpenAI is clearly better than G42; strategically, though, OpenAI is also incubating a competitor: once its own ASICs mature, they pose a real threat to Cerebras.
To be fair, Cerebras is actively courting other customers, and its prospectus is expected to show more diversified revenue sources, which should ease the concentration problem. But until OpenAI’s in-house chips reach mass production, the answer remains open.
Buying Cerebras stock is essentially a bet on two things: that OpenAI will keep choosing Cerebras, and that OpenAI’s own ASICs won’t arrive too soon. Both are uncertain.
Of course, the bullish case is real: if inference market size grows as projected, even a small share for Cerebras could be huge in absolute numbers. The question isn’t whether Cerebras has a chance, but whether the $35 billion valuation already reflects the most optimistic scenario.
Two $20 billion deals, appearing symmetrically between late 2025 and April 2026.
One from the world’s largest AI chip seller, acquiring a competitor’s inference technology.
One from the world’s largest buyer of AI compute, incubating a company that challenges Nvidia in inference.
Nvidia’s $20 billion is a defensive move—using the highest price to fill a technical gap it cannot close itself.
OpenAI’s $20 billion is an offensive move—burning money to build an inference highway independent of Nvidia, while also acquiring a toll booth in the form of equity warrants.
This war has no gunfire, but the flow of capital never lies. These two deals tell you more clearly than any AI launch: control over AI inference infrastructure is being contested. And by 2026, this market will account for two-thirds of the industry’s compute expenditure.
Cerebras’ IPO is the clarion call of this war.