Cambricon completes DeepSeek-V4 adaptation and open-sources the code as domestic chip stocks rise

ME News, April 24 (UTC+8): According to monitoring by Dongcha Beating, Cambricon announced that on the day V4 was released it completed the adaptation of two models, the 285B DeepSeek-V4-Flash and the 1.6T DeepSeek-V4-Pro, on the vLLM inference framework, and that the adaptation code has been open-sourced on GitHub.
The fast turnaround rests on two prerequisites. First, Cambricon's self-developed NeuWare software stack natively supports mainstream frameworks such as PyTorch and vLLM, which allows models to be migrated quickly. Second, Cambricon chips natively support mainstream low-precision data formats, so accuracy can be verified without additional format conversion. For V4's new architecture, Cambricon used its self-developed fused-operator library Torch-MLU-Ops to accelerate specific modules such as Compressor and mHC, and wrote kernels in BangC for the hot-spot operators: sparse/compressed Attention and GroupGemm.
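For readers unfamiliar with the term, GroupGemm (grouped GEMM) generally refers to running several independent matrix multiplications of different sizes as one operation, for example when each group of rows has its own weight matrix. The sketch below is a plain-Python reference for that idea only. It is not Cambricon's BangC kernel or the Torch-MLU-Ops API, and the names `matmul` and `grouped_gemm` and the sample data are hypothetical, chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix multiply: (m x k) @ (k x n) -> (m x n)."""
    k = len(b)
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    n = len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)] for row in a]


def grouped_gemm(rows: Matrix, group_sizes: List[int], weights: List[Matrix]) -> Matrix:
    """Reference grouped GEMM (hypothetical helper, for illustration only).

    `rows` is assumed to be sorted by group already: the first group_sizes[0]
    rows belong to group 0, the next group_sizes[1] rows to group 1, and so on.
    Each group is multiplied by its own weight matrix; output keeps row order.
    """
    if len(group_sizes) != len(weights):
        raise ValueError("need one weight matrix per group")
    if sum(group_sizes) != len(rows):
        raise ValueError("group sizes must cover every row")
    out: Matrix = []
    start = 0
    for size, w in zip(group_sizes, weights):
        if size:  # a group may receive no rows at all
            out.extend(matmul(rows[start:start + size], w))
        start += size
    return out


# Three input rows split into two groups of sizes 1 and 2.
rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [
    [[1.0, 0.0], [0.0, 1.0]],  # group 0: identity
    [[2.0, 0.0], [0.0, 2.0]],  # group 1: scale by 2
]
result = grouped_gemm(rows, [1, 2], weights)
print(result)  # [[1.0, 2.0], [6.0, 8.0], [10.0, 12.0]]
```

A hardware kernel would typically fuse the per-group loop into a single launch so that small groups do not each pay a separate dispatch cost; the loop here only shows what such a kernel has to compute.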
At the inference-framework level, Cambricon supports five-dimensional hybrid parallelism (TP/PP/SP/DP/EP), overlapping of communication and computation, low-precision quantization, and prefill-decode (PD) disaggregated deployment in vLLM. The V4 technical report mentions verification only on NVIDIA GPUs and Huawei Ascend NPUs and does not cover the Cambricon platform; Cambricon completed this adaptation independently. Spurred by news of the V4 release, the A-share domestic chip sector strengthened, and Cambricon's share price surged during the session.
(Source: BlockBeats)