Ark Invest: The Current State and Future of AI Infrastructure
Source: Frank Downing, Ark Invest; Translation: Golden Finance Claw
Explosion in AI Infrastructure Spending
In the three years since ChatGPT’s release, demand for accelerated computing has skyrocketed. NVIDIA’s annual revenue has surged nearly eightfold, from $27 billion in 2022 to $216 billion in 2025, and market consensus expects a further 62% of growth in 2026, to roughly $350 billion. Global data center system investment (including computing, networking, and storage hardware) has accelerated from average annual growth of 5% over the prior decade to 30% in the last three years, and is expected to grow more than 30% again in 2026, reaching $653 billion.
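As a sanity check, the implied growth math can be reproduced directly; a minimal sketch, where every input is a revenue figure quoted in the text:

```python
# Reproduce the growth arithmetic cited above (inputs are the figures in the text).
nvda_2022 = 27e9             # NVIDIA annual revenue, 2022 (USD)
nvda_2025 = 216e9            # NVIDIA annual revenue, 2025 (USD)

multiple = nvda_2025 / nvda_2022                # 8.0x, i.e. "nearly eightfold"
cagr = (nvda_2025 / nvda_2022) ** (1 / 3) - 1   # implied 3-year CAGR: ~100%/yr
nvda_2026 = nvda_2025 * 1.62                    # consensus +62% -> ~$350B

print(f"{multiple:.1f}x growth, {cagr:.0%} CAGR, 2026E ${nvda_2026 / 1e9:.0f}B")
```

Growing 8x in three years implies revenue roughly doubling every year, which is the scale of demand the rest of the piece is trying to explain.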
ARK’s research shows that accelerated computing driven by GPUs and AI-specific chips (ASICs) now dominates server investments, accounting for 86% of server sales.
Rapid Cost Decline Fuels Adoption of Accelerated Computing
The ongoing increase in spending on infrastructure needed to run AI models is driven by expanding use cases of generative AI in consumer and enterprise sectors, as well as the need to train smarter foundational models in the pursuit of “superintelligence.”
The rapid decline in costs further accelerates demand growth. Our research indicates that AI training costs decrease by 75% annually. Inference costs are dropping even faster—in benchmark tests tracked by Artificial Analysis, over 50% of models show an annualized cost reduction of up to 95%.
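These decline rates compound quickly. A minimal sketch of the arithmetic, using the 75% and 95% annual rates cited above (the $100 starting cost is an arbitrary illustration):

```python
def cost_after(initial_cost: float, annual_decline: float, years: int) -> float:
    """Cost remaining after `years` of a compounding annual decline.

    annual_decline=0.75 means costs fall 75% each year.
    """
    return initial_cost * (1 - annual_decline) ** years

# A $100 training workload at a 75%/yr decline costs ~$1.56 after three years...
training = cost_after(100.0, 0.75, 3)    # 100 * 0.25**3 = 1.5625
# ...while at a 95%/yr decline (fast-falling inference) it costs ~$0.0125.
inference = cost_after(100.0, 0.95, 3)   # 100 * 0.05**3 = 0.0125
print(f"training: ${training:.2f}, inference: ${inference:.4f}")
```

At the 95% rate, costs fall by a factor of 8,000 in three years, which is why use cases that were uneconomical at launch keep becoming viable.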
Two forces are driving these cost reductions: first, industry leaders like NVIDIA release new products annually, delivering continuous hardware performance improvements; second, software algorithm improvements increase training and inference efficiency on the same hardware.
Strong Signals from Consumers and Enterprises
Consumer adoption of AI is happening much faster than internet adoption at the same stage. The AI adoption rate has expanded to about 20% in three years, more than twice the speed of internet adoption.
Enterprise demand is also growing rapidly. For example, according to OpenRouter data, since December 2024, token demand has increased 28-fold.
Over the past two years, AI labs favored by enterprises, such as Anthropic, have achieved astonishing revenue growth—about 100 times—rising from $100 million in annualized revenue at the end of 2023 to an estimated $8-10 billion by the end of 2025. Anthropic’s momentum continues into 2026, with announced annualized revenue reaching $14 billion and a $30 billion funding round valuing the company at $380 billion.
OpenAI, competing on both the consumer and enterprise fronts, has also seen strong growth among enterprise clients, reaching 1 million corporate customers by November 2025. CFO Sarah Friar says enterprise revenue is growing faster than the consumer business and is expected to account for 50% of total revenue by 2026. In a January 2026 blog post, Friar explained the rationale for further infrastructure investment: over the past three years, OpenAI’s revenue has grown in proportion to its computing capacity.
Private Markets Fuel AI Infrastructure Investment
To meet strong demand signals, large-scale infrastructure investments are necessary. According to Crunchbase, private AI lab funding exceeded $200 billion in 2025, with about $80 billion flowing to foundational model developers like OpenAI, Anthropic, and xAI. In the public markets, mega cloud providers are deploying cash reserves and seeking additional financing to support AI capital expenditures—projected to reach up to $700 billion in 2026.
Meta’s $30 billion deal with Blue Owl is reportedly the largest private capital transaction ever. The deal is structured as a joint venture financed primarily with debt; its special purpose vehicle (SPV) keeps the project debt off Meta’s balance sheet, which has sparked considerable controversy.
AMD and Other Competitors Challenge NVIDIA
Beyond physical data centers, compute chips remain the core of AI capital expenditure. NVIDIA has led the accelerated computing era, but the largest AI chip buyers are now trying to maximize AI compute per dollar invested. AMD, a key player alongside NVIDIA in consumer GPUs since it acquired ATI Technologies in 2006, has recently emerged as a competitor in enterprise markets as well. Since AMD launched its EPYC processor line in 2017, its server CPU market share has grown from nearly zero to 40% in 2025.
For small model inference, AMD GPUs now offer total cost of ownership (TCO) per unit of performance comparable to NVIDIA’s. TCO accounts for both the upfront purchase cost (capital expenditure) and the operating costs over the chip’s lifespan (operating expenses). Using SemiAnalysis’s InferenceMax metric, which measures tokens processed per second per GPU in a throughput-optimized configuration, together with cost estimates based on hourly capital and operating expenses, AMD’s GPUs are competitive.
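This kind of TCO comparison reduces to a simple unit-cost calculation. A sketch of that arithmetic, where the hourly cost and throughput figures are hypothetical placeholders rather than InferenceMax measurements:

```python
def cost_per_million_tokens(hourly_capex: float, hourly_opex: float,
                            tokens_per_second: float) -> float:
    """Unit cost combining amortized purchase cost (capex) and running cost
    (opex) per GPU-hour, spread over the tokens served in that hour."""
    tokens_per_hour = tokens_per_second * 3600
    return (hourly_capex + hourly_opex) * 1e6 / tokens_per_hour

# Hypothetical GPU: $2.00/hr amortized capex, $0.50/hr power and operations,
# sustaining 10,000 tokens/second in a throughput-optimized configuration.
print(f"${cost_per_million_tokens(2.00, 0.50, 10_000):.3f} per 1M tokens")
```

A chip with lower raw throughput can still win on this metric if its hourly cost is proportionally lower, which is the sense in which AMD is "competitive" here despite NVIDIA's performance lead on large models.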
While AMD has caught up in small model performance, NVIDIA still maintains a significant lead in large model performance, as shown below.
NVIDIA’s rack-scale Grace Blackwell solution (GB200 NVL72) interconnects 72 Blackwell GPUs so tightly that they operate as one large shared-memory GPU. That interconnect benefits inference on large models, whose weights must be distributed across multiple GPUs and which demand far more communication bandwidth. AMD’s own rack-scale solution is scheduled to launch in late 2026, aiming to narrow the gap before NVIDIA’s Vera Rubin arrives. So far, AMD has secured orders from clients including Microsoft, Meta, OpenAI, xAI, and Oracle.
Leading the Custom Chip Revolution: Major Cloud Providers
Beyond the commercial GPU suppliers, mega cloud providers and AI labs are developing their own chips to reduce NVIDIA’s influence and lower AI computing costs. Google has been designing its own AI-specific chips, Tensor Processing Units (TPUs), for over a decade: first used to run recommendation models in Search, and now optimized for generative AI in the latest TPU v7. SemiAnalysis estimates that running workloads on in-house TPUs cuts Google’s internal compute costs by 62% compared with NVIDIA hardware. Anthropic and Meta are expanding their compute capacity with Google’s TPUs, lending support to that estimate.
Amazon’s Trainium chips appear to be a more modest effort. After acquiring Annapurna Labs in 2015, Amazon built custom silicon for its cloud business, expanding from ARM-based Graviton CPUs and Nitro DPUs (data processing units) to cover AWS’s critical compute needs. Amazon recently announced that in 2025, for the third consecutive year, Graviton accounted for more than half of the new CPU capacity added to AWS. In addition to Google’s TPUs, Anthropic uses AWS and Trainium as a primary training platform.
Microsoft entered the custom chip space later, releasing the Maia 100 AI accelerator in 2023, initially focused on inference rather than generative AI. Its second-generation product is now launching, targeting AI inference workloads.
Broadcom Leads the Custom Chip Service Market
Google and Amazon focus on front-end chip design (architecture and functionality), while back-end partners handle physical implementation, advanced packaging, and coordination with foundries such as TSMC. Amid the struggles of Intel’s foundry business, TSMC has become the preferred fabrication partner for most major AI chip projects, and Broadcom has emerged as the leading back-end design partner, serving Google’s TPU, Meta’s MTIA, and OpenAI’s custom chips coming in 2026. Apple has traditionally handled the full design of its mobile and PC chips but is also rumored to be collaborating with Broadcom on AI chips. Citibank predicts Broadcom’s AI revenue could grow fivefold over the next two years, from $20 billion in 2025 to $100 billion in 2027.
Amazon’s Trainium development path is unusual: reports indicate Trainium 2 was built with Marvell, but after Marvell underperformed, Trainium 3 and 4 shifted to Alchip. Amazon’s willingness to switch back-end partners highlights the risk these large customers pose for companies like Broadcom. Notably, Apple and Tesla work with foundries directly. Google may do the same with its TPU v8, which has two SKUs: one co-designed with Broadcom, and another designed and managed independently by Google with MediaTek support.
Chip Startup Activity Heats Up
Our research indicates that a long tail of startups experimenting with new architectural paradigms could further challenge incumbent chip vendors. Cerebras, known for its wafer-scale engine (a giant chip built from a single silicon wafer roughly the size of a pizza box), offers the fastest token processing speeds on the market and plans to go public this year. It recently announced a partnership with OpenAI to launch the high-speed coding model Codex Spark, following an earlier collaboration in January. Groq, also known for impressive token throughput, recently signed a $20 billion non-exclusive IP licensing deal with NVIDIA that includes 90% of Groq’s staff and its CEO Jonathan Ross, a co-creator of Google’s TPU. The structure amounts to an effective acquisition of Groq’s team and technology, an increasingly common M&A pattern as tech giants seek to avoid regulatory delays. In other M&A news, Intel, after reportedly failing to land an acquisition target, has shifted to a partnership with SambaNova. Intel has made four AI acquisitions since 2014 but has yet to launch a widely recognized AI product, a disappointing record.
Looking Ahead: Market Could Reach $1.4 Trillion by 2030
Our projections suggest that sustained demand growth and performance improvements over the next five years will drive AI software and cloud services development, with AI infrastructure spending tripling—from $500 billion in 2025 to nearly $1.5 trillion in 2030.
This forecast is based on historical data of data center system investments relative to software revenue. In the early 2010s, with the rise of cloud computing, system investments accounted for about 50% of global software spending. By 2021, after pandemic-driven overinvestment and customer optimization, this ratio fell below 20%. Our $1.5 trillion estimate assumes that by 2030, investments will be about 20% of our neutral scenario for global software spending (which we project at $7 trillion in 2030). We detailed this assumption in a blog last year. We believe this 20% level adequately accounts for potential overinvestment risks before 2030 and the possibility that software revenue growth may be slower than the neutral scenario—if so, infrastructure investments could continue to grow rapidly, similar to the early 2010s.
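The forecast arithmetic described above can be reproduced directly; a sketch, where the $7 trillion software figure and 20% ratio are the stated assumptions:

```python
software_spend_2030 = 7e12   # ARK neutral scenario: global software spending, 2030
infra_ratio = 0.20           # assumed data-center-investment-to-software ratio
infra_spend_2025 = 500e9     # AI infrastructure spending, 2025 (from the text)

infra_spend_2030 = software_spend_2030 * infra_ratio    # ~$1.4 trillion
growth_multiple = infra_spend_2030 / infra_spend_2025   # ~2.8x, roughly tripling
print(f"2030E: ${infra_spend_2030 / 1e12:.1f}T ({growth_multiple:.1f}x vs. 2025)")
```

Note that the forecast is a product of two assumptions, so it moves linearly with either one: a 15% ratio or a $6 trillion software market would each trim several hundred billion from the 2030 estimate.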
As AI-driven compute demand continues to rise, we expect the share of custom chips in total computing expenditure to increase, as designing chips for specific workloads will demonstrate increasingly important performance-per-dollar advantages at scale. We estimate that by 2030, custom ASICs could account for over one-third of the computing market.
Overall, our research indicates that current infrastructure buildout is not a bubble about to burst but a rare platform-level transformation. ARK forecasts that by 2030, annual AI infrastructure spending will approach $1.5 trillion, driven by genuine and accelerating demand from consumers and enterprises. Falling costs continue to validate and unlock new use cases. We believe that the companies that will stand out in the next five years are those capable of designing the most efficient chips, building the most powerful models, and deploying both at scale.
As NVIDIA CEO Jensen Huang stated during the Q4 FY2026 earnings call, truly practical AI agents have only just begun to be deployed at scale in recent months. They consume enormous amounts of tokens but far surpass most users’ previous AI experiences. Scaling these agents to millions of enterprises will be highly compute-intensive, but the productivity gains from doing so will make these investments well worth it.