On April 10th, the official DeepSeek blog published an article introducing DeepSeek V4, the company's upcoming flagship model. According to the post, V4 not only pushes past previous parameter scales but also promises unprecedented efficiency: it is expected to reach 1 trillion (1T) parameters, natively support multimodal data including text, images, video, and audio, and offer a context window of 1 million tokens (roughly the length of 15-20 full novels), making it a direct competitor to Western flagships such as OpenAI's GPT-5.4 and Anthropic's Claude Opus 4.5.

On pricing and availability, the article claims API prices 10-50 times cheaper than GPT-5.4 and Claude Opus 4.5, an expected open-source release under the Apache 2.0 license, and local deployment on systems with two RTX 4090s or a single RTX 5090 (see the footprint arithmetic below).

DeepSeek also introduced three innovations it describes as revolutionary:

1. Engram memory;
2. Manifold-constrained hyper-connections (mHC);
3. A sparse attention mechanism (DSA) with a "lightning indexer" (a sketch follows at the end of this article).

Finally, in an official statement, DeepSeek noted that due to strict US export restrictions on high-end NVIDIA GPUs (such as the B300 and H200), V4 was optimized to rely primarily on chips manufactured in China for deployment. Although initial training could still use NVIDIA hardware (for example, the H800), the model was heavily optimized for Huawei Ascend 950PR and Cambricon MLU chips.
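The local-deployment claim is easiest to evaluate with rough arithmetic. The post does not explain how a 1T-parameter model fits on consumer GPUs, but a model of that size is generally assumed to be a mixture-of-experts (MoE) design in which only a small fraction of parameters is active per token, so only the active experts need to be GPU-resident while the rest are offloaded to system RAM or NVMe. The active-parameter count and quantization width below are illustrative assumptions, not figures from the post.

```python
# Back-of-the-envelope VRAM footprint for a quantized MoE checkpoint.
# All concrete numbers are illustrative assumptions, not DeepSeek figures.

total_params  = 1.0e12  # rumored total parameter count (1T)
active_params = 40e9    # hypothetical active parameters per token (assumed MoE)
bits_per_w    = 4       # assumed 4-bit weight quantization

def weight_gb(n_params: float, bits: int) -> float:
    """Weight storage in gigabytes for n_params at the given bit width."""
    return n_params * bits / 8 / 1e9

print(f"full checkpoint: {weight_gb(total_params, bits_per_w):.0f} GB")   # ~500 GB
print(f"active weights:  {weight_gb(active_params, bits_per_w):.0f} GB")  # ~20 GB

# Two RTX 4090s expose 48 GB of VRAM in total: enough for the active experts
# and KV cache only if the inactive experts are streamed from CPU RAM or disk.
```

In other words, the headline claim is plausible only under aggressive quantization combined with expert offloading; the post does not specify which scheme DeepSeek intends.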

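DeepSeek has already documented a sparse-attention design of this kind in DeepSeek-V3.2-Exp: a lightweight "lightning indexer" scores every preceding token against the current query, and full attention is then computed only over the top-k highest-scoring tokens, which keeps long-context cost manageable. Whether V4's DSA is identical is not stated in the post; the sketch below illustrates the mechanism under that assumption, with all names, shapes, and constants chosen purely for illustration.

```python
import numpy as np

def indexer_scores(q_idx, k_idx, w):
    """Lightning-indexer scores for one query position.

    q_idx: (H, d_idx) small per-head indexer queries for the current token
    k_idx: (T, d_idx) indexer keys for the T preceding tokens
    w:     (H,)       per-head mixing weights
    Score for token s: sum_h w[h] * relu(q_idx[h] . k_idx[s]).
    """
    logits = q_idx @ k_idx.T            # (H, T)
    return w @ np.maximum(logits, 0.0)  # (T,)

def sparse_attention(q, K, V, scores, top_k):
    """Attend only to the top_k past tokens the indexer ranks highest."""
    keep = np.argsort(scores)[-top_k:]           # token indices to keep
    logits = K[keep] @ q / np.sqrt(q.shape[-1])  # scaled dot-product scores
    p = np.exp(logits - logits.max())
    p /= p.sum()                                 # softmax over selected tokens
    return p @ V[keep]                           # (d,) context vector

# Toy usage: 1024 cached tokens, attend to only 128 of them.
rng = np.random.default_rng(0)
T, d, H, d_idx = 1024, 64, 4, 32
q, K, V = rng.normal(size=(d,)), rng.normal(size=(T, d)), rng.normal(size=(T, d))
scores = indexer_scores(rng.normal(size=(H, d_idx)),
                        rng.normal(size=(T, d_idx)),
                        rng.normal(size=(H,)))
out = sparse_attention(q, K, V, scores, top_k=128)
```

The indexer runs in a much smaller dimension than the main attention (d_idx of 32 versus d of 64 here; far smaller in practice), so scoring all T cached tokens stays cheap even at million-token context lengths.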