Xiaomi Open-Sources MiMo-V2.5 Series: 1T Parameters, MIT License, and Better Token Efficiency Than GPT-5.4 on Claw-Eval

According to Beating Monitoring, Xiaomi's MiMo team has open-sourced the MiMo-V2.5 series of large models. The series comprises two models, both released under the MIT license, which permits commercial deployment, continued training, and fine-tuning, and both support a context window of up to 1 million tokens. MiMo-V2.5-Pro is a text-only MoE (mixture-of-experts) model with 1.02 trillion total parameters and 42 billion active parameters; MiMo-V2.5 is a natively multimodal model with 310 billion total parameters and 15 billion active parameters that supports text, image, video, and audio understanding.
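As a rough illustration of what the total and active figures imply, the sketch below computes the share of parameters that are active for each model. It uses only the parameter counts reported above; the names `MODELS` and `active_fraction` are hypothetical, and reading "active" as "used per token" is the usual MoE convention rather than something the announcement spells out.

```python
# Parameter counts as reported in the article, in billions.
MODELS = {
    "MiMo-V2.5-Pro": {"total_b": 1020.0, "active_b": 42.0},
    "MiMo-V2.5": {"total_b": 310.0, "active_b": 15.0},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters that are active (assumed: per token)."""
    return active_b / total_b


for name, params in MODELS.items():
    share = active_fraction(params["total_b"], params["active_b"])
    print(f"{name}: {share:.1%} of parameters active")
```

On these figures, both models activate only about 4% to 5% of their weights at a time, which is why the active count, not the total, is the better guide to per-token compute.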

MiMo-V2.5-Pro mainly targets complex agent and programming tasks. On the Claw-Eval benchmark, V2.5-Pro reached a Pass^3 of 64%, on par with Claude Opus 4.6, Gemini 3.1 Pro, and GPT-5.4, while consuming only about 70k tokens per task trajectory, roughly 40% to 60% fewer than those models. It scored 78.9 on SWE-bench Verified. In a case study on the official blog, V2.5-Pro autonomously built a complete SysY-to-RISC-V compiler for the Peking University compiler principles course project, taking 4.3 hours and 672 tool calls and earning a perfect 233/233 on the hidden test set.
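The two headline numbers can be unpacked with a little arithmetic. The sketch below assumes that Pass^3 follows the common pass^k convention (a task counts only if all k sampled attempts succeed), which the article does not define, and it back-computes the per-trajectory token counts implied for the other models by the "40% to 60% fewer" range. The function names and the per-task success counts are hypothetical illustrations, not reported data.

```python
from math import comb


def pass_hat_k(successes_per_task: list[int], n_trials: int, k: int) -> float:
    """Estimate the chance that k randomly chosen trials of a task all succeed,
    averaged over tasks (assumed definition of Pass^k)."""
    per_task = [comb(c, k) / comb(n_trials, k) for c in successes_per_task]
    return sum(per_task) / len(per_task)


def implied_baseline_tokens(tokens: float, reduction: float) -> float:
    """Token count of a baseline that `tokens` undercuts by `reduction` (0..1)."""
    return tokens / (1.0 - reduction)


# Hypothetical example: 4 tasks, 3 trials each, with 3, 3, 2 and 0 successes.
print(pass_hat_k([3, 3, 2, 0], n_trials=3, k=3))

# Reported figure: about 70k tokens, 40% to 60% fewer than the other models.
for reduction in (0.4, 0.6):
    print(round(implied_baseline_tokens(70_000, reduction)))
```

Under this reading, a task that passes two out of three attempts contributes nothing to Pass^3, and the comparison models would be spending somewhere between roughly 117k and 175k tokens per trajectory.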

MiMo-V2.5 is designed for multimodal agent scenarios. The model is equipped with a dedicated ViT visual encoder and a dedicated audio encoder, and it scored 62.3 on the Claw-Eval general subset. Both models adopt a hybrid architecture that combines sliding window attention (SWA) with global attention (GA), together with a 3-layer multi-token prediction (MTP) module, which predicts multiple tokens at once to accelerate inference. The weights have been released on Hugging Face.

Alongside the open-source release, the MiMo team launched the "Orbit Trillion Token Creator Incentive Program," which will give away a total of 1 trillion tokens to users worldwide over 30 days. Individual developers, teams, and enterprises can apply on the event page; review takes about 3 working days. Once an application is approved, the benefits are credited as Token Plans or grants, which can be used directly with programming tools such as Claude Code and Cursor.
