MiniMax M3's specs look seriously stacked. Wait 10 days for the open-source release and you'll see what everyone means by "so worth it."

CoinNetwork
MiniMax releases M3 large model: programming ability surpasses GPT-5.5, supports native multimodal desktop control
CryptoWorld News reports that MiniMax officially released its M3 large model today. The report describes M3 as currently the only open-source large model that combines three frontier capabilities: programming, ultra-long context, and native multimodality. MiniMax plans to formally open-source the weights within 10 days. The model is said to reach world-leading levels in code generation, agents, and desktop control, and can be tried now through MiniMax code, token plan, and the API. M3 introduces a sparse attention architecture called MSA, which aggregates hit queries across KV blocks and is reported to make memory access 4 times faster than Flash-sparse-attention. With a 1 million token context window, the new architecture cuts per-token computation to one twentieth of the previous generation's, for a 9x prefill speedup and a 15x decoding speedup. On SWE-bench Pro, M3 scored 59.0%, surpassing GPT-5.5 and Gemini 3.1.
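For readers wondering what "aggregates hit queries across KV blocks" could mean in practice, here is a rough sketch of one plausible reading: instead of each query fetching its selected KV blocks separately, the selection is inverted so each KV block is loaded once and shared by every query that hit it. This is only an illustration under that assumption, not MSA's actual implementation, which the report does not describe. The function names and the toy `selection` input are hypothetical.

```python
from collections import defaultdict


def group_queries_by_kv_block(selected_blocks):
    """Invert a query -> KV-block selection into KV-block -> queries.

    selected_blocks[q] lists the KV block indices that query q attends to.
    How blocks get selected is an assumption here; the report does not say.
    """
    hits = defaultdict(list)
    for q, blocks in enumerate(selected_blocks):
        for b in blocks:
            hits[b].append(q)
    return dict(hits)


def count_block_loads(selected_blocks):
    """Return (loads when each query fetches its own blocks,
    loads when each hit block is fetched once for all its queries)."""
    per_query = sum(len(blocks) for blocks in selected_blocks)
    grouped = len(group_queries_by_kv_block(selected_blocks))
    return per_query, grouped


# Toy example: 4 queries, each selecting 2 of 3 KV blocks.
selection = [[0, 2], [0, 1], [0, 2], [1, 2]]
grouped = group_queries_by_kv_block(selection)
loads = count_block_loads(selection)
print(grouped)
print(loads)
```

In this toy case the per-query layout fetches KV blocks 8 times while the grouped layout fetches each of the 3 blocks once. The counts only illustrate why grouping can reduce memory traffic and say nothing about the 4x figure quoted in the report.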