DeepSeek V4 has finally been released!


Based on the comparisons below, it is currently the most powerful open-source model.
In coding, mathematics, long-text processing, and agent tasks, it ranks in the global top tier, with some metrics surpassing GPT-4o and Claude Opus 4.6.

1. Version and Positioning

- V4-Pro: the flagship, comparable to GPT-4o/Opus 4.6; the strongest open-source model.
- V4-Flash: lightweight and fast, with a high performance-to-cost ratio.
- Architecture: 1.6T-parameter MoE, with approximately 370B parameters activated per token; 1-million-token context.
- Computing power: runs on a full Huawei Ascend 950PR stack, moving away from NVIDIA.
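
To put the MoE figures above in perspective, here is a minimal sketch of the arithmetic, using only the two numbers quoted in this post (1.6T total, about 370B activated). The function name `active_fraction` is a hypothetical helper for illustration, and the figures themselves are not independently verified.

```python
# Figures as quoted in the post; treat them as claims, not verified specs.
TOTAL_PARAMS = 1.6e12   # 1.6T total parameters
ACTIVE_PARAMS = 370e9   # ~370B activated parameters


def active_fraction(active: float, total: float) -> float:
    """Return the share of a MoE model's parameters that is activated."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share: {fraction:.1%}")  # prints "Active share: 23.1%"
```

In other words, if the quoted figures hold, a bit under a quarter of the weights take part in any single step, which is the usual reason a MoE model can be cheaper to run than a dense model of the same total size.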

2. Core Performance Comparison (Authoritative Evaluation)

1️⃣ Programming (Strongest Point)

- HumanEval: 90% (>Opus 4.5 88%, >GPT-4 82%).
- SWE-Bench: >80%, leading in real-world software engineering capability.
- Conclusion: the world's strongest AI programmer.

2️⃣ Mathematics/Reasoning

- MATH/STEM: surpasses all open-source models and is comparable to GPT-4o/Opus 4.6.
- Agent capability: in Agentic Coding it is the best among open-source models, better than Claude Sonnet 4.5 and close to Opus 4.6 (non-thinking mode).

3️⃣ Long Texts

- Context: 1M tokens (≈700k Chinese characters), top three globally (behind only Gemini 3.1).
- Practical test: analyzes million-word novels/entire libraries without crashing; the strongest Chinese-made long-text model.
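
The "1M tokens ≈ 700k Chinese characters" figure above implies roughly 1.43 tokens per character. The sketch below uses that ratio to estimate whether a text of a given length would fit in the context window. It is a rough estimate only: the ratio comes from this post rather than from a real tokenizer, and `estimated_tokens` and `fits_in_context` are hypothetical helper names.

```python
# Ratio taken from the post: 1,000,000 tokens for about 700,000 Chinese characters.
CONTEXT_TOKENS = 1_000_000
CHINESE_CHARS = 700_000


def estimated_tokens(num_chars: int) -> int:
    """Estimate the token count for num_chars Chinese characters (rounded up)."""
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-num_chars * CONTEXT_TOKENS // CHINESE_CHARS)


def fits_in_context(num_chars: int, reserved_for_output: int = 0) -> bool:
    """Check whether the text plus a reserved output budget fits in the window."""
    return estimated_tokens(num_chars) + reserved_for_output <= CONTEXT_TOKENS


print(estimated_tokens(500_000))  # 714286
print(fits_in_context(500_000))   # True
print(fits_in_context(800_000))   # False
```

Actual token counts depend on the tokenizer and on the mix of languages in the text, so a real check should count tokens with the model's own tokenizer.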

4️⃣ World Knowledge

- Leading all open-source models, slightly below Gemini 3.1 Pro.

3. Overall Ranking (2026.4.24)

- Top closed-source tier:
  1. Gemini 3.1 Pro (strongest in reasoning/long texts)
  2. Claude Opus 4.6 (most versatile and balanced)
  3. GPT-4o (strongest ecosystem)
  4. DeepSeek V4-Pro (top in coding/long texts, first among Chinese models)
- Top open-source tier:
  - DeepSeek V4-Pro (clear first place, comprehensively ahead of Llama 3/Qwen 3)

4. Key Advantages

- ✅ Strongest in coding: surpasses GPT-4o/Claude, with engineering-grade task capability.
- ✅ 1M context: top globally in long-text processing.
- ✅ Chinese-made computing power: runs on a full Ascend stack, at only 1/70 of GPT-4’s cost.
- ✅ Open-source and commercially usable: V4-Pro and V4-Flash are open-source under the MIT license.

5. Shortcomings

- Overall ability is slightly below Gemini 3.1/Opus 4.6, especially in deep reasoning.
- Multimodal (image and text) capabilities are weaker than those of GPT-4o/Gemini.