Goldman Sachs: What does DeepSeek V4 mean for China's AI?

Author: Bao Yilong, Wall Street Insights

Goldman Sachs believes the core significance of DeepSeek V4 is that it supports more complex AI agent applications at lower cost, opening new room for AI applications to scale.

On April 24, Goldman Sachs' Ronald Keung and team published a research report stating that the new open-source V4 model continues DeepSeek's efficiency-first, open-source approach.

At the technical level, V4 sharply cuts the cost of long context windows through architectural upgrades and explicitly bets on Huawei's domestically produced chips. At the market level, the release intensifies competition among Chinese AI models, with programming ability, task completion rate, and multimodality becoming the main dividing lines for pricing power.

Goldman Sachs maintains its recommendations on the cloud computing and data center sectors, arguing that continued improvement in the cost efficiency of computing power will speed up AI application penetration. Growth in both enterprise AI agents and consumer AI assistants will support further gains in cloud service pricing power.

V4 Architecture Upgrade, Supporting Longer Contexts with Less Memory

DeepSeek V4 is released in two versions: Pro and Flash.

The Pro version is flagship-scale, with 1.6 trillion parameters (16k active parameters); the Flash version is relatively lightweight, with 284 billion parameters (130 billion active parameters). Both models support an ultra-long context window of 1 million tokens, on par with state-of-the-art (SOTA) US models, while requiring significantly less memory and KV cache.

According to Goldman Sachs' report, at a 1-million-token context V4 Pro needs only 27% of the inference FLOPs of DeepSeek V3.2 and just 10% of its KV cache; V4 Flash goes further, cutting FLOPs to 10% and KV cache to 7% of the V3.2 level.
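
The percentages above are easier to compare as reduction factors relative to DeepSeek V3.2. The following is a minimal sketch of that arithmetic; it uses only the four figures quoted from the report, and the names `RELATIVE_COST`, `reduction_factor`, and `summarize` are illustrative, not taken from Goldman Sachs or DeepSeek.

```python
# Fractions of DeepSeek V3.2's cost at a 1-million-token context, as quoted
# in the article. The dictionary layout and names are illustrative only.
RELATIVE_COST = {
    "V4 Pro": {"flops": 0.27, "kv_cache": 0.10},
    "V4 Flash": {"flops": 0.10, "kv_cache": 0.07},
}


def reduction_factor(fraction: float) -> float:
    """Convert 'fraction of baseline cost' into 'times smaller than baseline'."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return 1.0 / fraction


def summarize(relative_cost: dict) -> dict:
    """Return reduction factors per model and metric, rounded to one decimal."""
    return {
        model: {metric: round(reduction_factor(f), 1) for metric, f in costs.items()}
        for model, costs in relative_cost.items()
    }


if __name__ == "__main__":
    for model, factors in summarize(RELATIVE_COST).items():
        print(f"{model}: {factors['flops']}x fewer FLOPs, "
              f"{factors['kv_cache']}x smaller KV cache")
```

Read this way, the quoted figures correspond to roughly 3.7x fewer FLOPs and a 10x smaller KV cache for V4 Pro, and 10x fewer FLOPs and about a 14.3x smaller KV cache for V4 Flash.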

This efficiency leap is achieved through three key architectural innovations:

Hybrid attention: V4 introduces a hybrid architecture combining compressed sparse attention (CSA) and heavily compressed attention (HCA). CSA compresses the KV cache along the sequence dimension before computing sparse attention, while HCA compresses more aggressively but retains dense attention. Together, they greatly reduce the temporary memory needed for long inputs.
Training stability: V4 introduces the mHC mechanism, which stabilizes information transfer across the network's many layers.
Optimizer: Muon serves as the primary training optimizer (with AdamW retained for some modules) to accommodate a network architecture more complex than V3's, improving convergence quality during training.
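
To see why compressing the KV cache along the sequence dimension matters at a 1-million-token context, the sketch below estimates cache size for a generic transformer. It is not a model of CSA or HCA: the layer count, head count, head dimension, and the compression ratio of 4 are all hypothetical values chosen for illustration, and `kv_cache_bytes` is an invented helper.

```python
def kv_cache_bytes(seq_len: int, num_layers: int, num_kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2,
                   seq_compression: int = 1) -> int:
    """Approximate KV cache size in bytes for a single sequence.

    Keys and values are both cached, hence the factor of 2.
    seq_compression is a hypothetical ratio: every `seq_compression`
    tokens are assumed to share one cached entry.
    """
    if seq_compression < 1:
        raise ValueError("seq_compression must be >= 1")
    cached_positions = -(-seq_len // seq_compression)  # ceiling division
    return (2 * num_layers * num_kv_heads * head_dim
            * cached_positions * bytes_per_value)


if __name__ == "__main__":
    # Hypothetical model shape, NOT DeepSeek V4's actual configuration.
    shape = dict(num_layers=60, num_kv_heads=8, head_dim=128)
    baseline = kv_cache_bytes(1_000_000, **shape)
    compressed = kv_cache_bytes(1_000_000, seq_compression=4, **shape)
    print(f"uncompressed: {baseline / 1e9:.2f} GB")
    print(f"4x sequence compression: {compressed / 1e9:.2f} GB")
```

Because cache size grows linearly with the number of cached positions, any compression along the sequence dimension reduces memory by the same ratio, which is why the benefit is largest for very long inputs.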

Goldman Sachs points out that these efficiency gains matter most for long-horizon tasks, such as long-running agent tasks that must process large amounts of context.

It is worth noting that DeepSeek remains focused on foundational text models, while internet giants such as Alibaba and ByteDance and independent model players such as MiniMax tend to pursue multimodal or all-modal routes, a clear divergence in the path toward AGI.

Domestic Chips Accelerate Deployment, Huawei Ascend 950 Paves the Way for Price Reductions

Another important signal from the V4 release is that DeepSeek explicitly includes the mass production of Huawei's Ascend 950 supernodes in its commercial roadmap.

DeepSeek expects that as Huawei's Ascend 950 supernodes achieve large-scale supply in the second half of 2026, the API pricing for V4 Pro will see a significant decrease.

Goldman Sachs' report says this statement has two implications:

First, DeepSeek's cost competitiveness will be further strengthened, creating conditions for broader application deployment. Second, against the backdrop of continued tightening of access to chips, the migration of top Chinese AI models to domestically produced computing power now has the clear endorsement of a leading player.

Goldman Sachs data shows that V4 Pro's current pricing on mainstream API platforms is already competitive, and this advantage is expected to widen in the second half of 2026 as the supply of domestic computing power expands.

Chinese AI Model Competition Enters a Differentiated Stage

The open-source release of DeepSeek V4 has quickly triggered a new round of releases from other Chinese AI model players.

According to Goldman Sachs' overview, recently launched models include Kimi K2.6, Alibaba Qwen3.6-Max, Tencent Hy3 preview, and Xiaomi V2.5, with MiniMax M3/Hailuo expected to launch in May.

In Goldman Sachs' view, the future pricing power of these models will be determined mainly along two dimensions:

Programming/task success rate, with Zhipu's GLM model ranking top in coding ability;
Multimodal capability, with ByteDance, Alibaba, and MiniMax investing most deeply in this area.

The report points out that the two types of players, independent AI companies and internet giants, each have clear strengths and weaknesses:

Independent AI players, such as MiniMax, are organizationally efficient with short decision-making chains, and even with very low basic text API prices can still achieve a 40% gross margin, according to Goldman Sachs' forecast.
Internet giants, such as ByteDance, Tencent, and Alibaba, have abundant cash flow from their core businesses, making them better positioned to invest in AI infrastructure and cloud. However, they need to set up independent AI teams with incentive schemes to retain talent; ByteDance's Doubao team, for example, already has its own incentives.

Goldman Sachs' report also cites news reports that Tencent and Alibaba are in talks to invest in DeepSeek at a valuation exceeding $20 billion, while Zhipu and MiniMax have latest market caps of approximately $53 billion and $31 billion, respectively. The potential deal reflects the giants' strategic competition for scarce top-tier AI capabilities.

Sector Thesis Unchanged: Cloud Computing and Data Centers

Goldman Sachs maintains its view that cloud computing and data centers remain the top sub-sectors within China's internet sector, for three reasons:

Continued growth in AI token demand will drive increased cloud service procurement;

Growth in enterprise customers and AI agents is improving cloud/token pricing power;

The ongoing penetration of consumer AI assistants contributes incremental demand.

In the B2B enterprise cloud market, Alibaba leads with the largest external AI cloud revenue; in the B2C consumer market, ByteDance currently has the highest daily token usage for AI chatbots. Daily active users (DAU) of China's AIGC applications continue to grow strongly, up 36% month over month in March 2026.

Goldman Sachs' key recommended names are GDS Holdings, VNET (Century Internet), Alibaba, and Kingsoft Cloud, which it sees as core allocations for capturing the gains from China's AI infrastructure expansion.

Beyond these, Goldman Sachs places the e-commerce and mobility sectors in the second tier, AI model-related stocks in the third tier, and the gaming and entertainment sectors in the fourth tier.
