Microsoft AI releases its first inference model, MAI-Thinking-1, and six native new models, launching a private reinforcement learning “Frontier Fine-tuning” service

According to Beating monitoring, Mustafa Suleyman, head of Microsoft’s AI division (Microsoft AI, abbreviated as MAI), announced the launch of a new in-house MAI native model family at the Build 2026 developer conference. The family includes a total of 7 models, covering reasoning, programming, images, transcription, and speech. All of them have been trained from scratch by Microsoft, without using any third-party models for knowledge distillation, and all datasets have obtained compliant authorization. Microsoft said it will work to build “Humanist Superintelligence,” ensuring that state-of-the-art AI serves as an auxiliary tool for humans and remains under human oversight. Meanwhile, Microsoft’s GB200 computing cluster, which has already been deployed, is now fully in operation and is being used to drive continuous iteration of this model ecosystem.

The flagship reasoning model in the MAI family, MAI-Thinking-1, has 35 billion active parameters, uses a Mixture of Experts (MoE) architecture, and provides a 128K context window. In mainstream software engineering and mathematical reasoning evaluations such as SWE-bench Pro, the model has achieved a performance level comparable to Claude Opus 4.6, and it performs better than Claude Sonnet 4.6 in blind human evaluations. For programming scenarios, MAI has introduced the 5 billion-parameter intelligent agent programming model MAI-Code-1-Flash. The model is deeply integrated into GitHub Copilot and VS Code, delivering performance comparable to Claude Haiku at a lower inference cost. In multimodal capabilities, MAI-Image-2.5 and its Flash variant support high-precision text-to-image generation and image editing, with image quality scores surpassing Nano Banana Pro. For speech and transcription, MAI has released a SOTA (state-of-the-art) 43-language transcription model, MAI-Transcribe-1.5, which is 5 times faster than competing solutions, as well as a speech generation model, MAI-Voice-2, and its Flash variant. MAI-Voice-2 supports 15 languages, offers emotion control, and enables zero-shot cloning.

These models will not only be deployed on Azure AI Foundry, but will also be listed on OpenRouter, Fireworks, and Baseten, and for the first time will support developers in fine-tuning their own weights. Microsoft also disclosed that, through joint software-hardware co-optimization of the models with its in-house chip Maia 200, it achieved a 1.4x improvement in computational efficiency.

In addition to the release of base models, Microsoft launched a “Frontier Tuning” service based on reinforcement learning environments (RLE). This service allows enterprises to perform customized training of MAI models in a fully controlled isolated environment (“training gyms”), using their internal operational trajectories, decision sequences, and professional data. Tests show that, after Frontier Tuning, the customized models’ efficiency is significantly improved. In the case of an MAI model optimized for Excel, its performance aligns with GPT-5.4 but is 10 times more efficient. For an MAI model customized for McKinsey, it achieves the highest win rate while reducing costs by nearly 10 times.

Microsoft also announced a strategic partnership with Mayo Clinic, one of the world’s top medical institutions. The partners will jointly develop clinical reasoning large models based on Mayo’s clinical data and Microsoft’s AI foundation model. The model will be owned entirely by Mayo Clinic. It will first be deployed internally at Mayo for early diagnosis and treatment plan design, and then opened to other medical institutions through Azure AI Foundry.

View Original
This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.
  • Reward
  • 5
  • 1
  • Share
Comment
Add a comment
Add a comment
GateUser-d6fb8ff1
· 3h ago
Mustafa Suleyman jumps from Inflection to Microsoft, now turning in his first report card
View OriginalReply0
SaveABitOnGasFees
· 3h ago
Third-party distillation is not used at all; it's a purely self-developed approach, with costs and risks maximized, but the control over the discourse power is also entirely in one's own hands.
View OriginalReply0
GateUser-dce566e8
· 3h ago
Build 2026 hasn't started yet, right? Was this news leaked early?
View OriginalReply0
GateUser-de0b9e3b
· 3h ago
All seven models are fully self-developed, trained from scratch.
How much investment does that require?
View OriginalReply0
MountainShadowsBeforeTheStorm
· 3h ago
Is Microsoft trying to completely eliminate its dependence on OpenAI?
View OriginalReply0
  • Pinned