Microsoft Build 2026 Developer Conference: The "Agent First" Era Arrives, Launching Seven Proprietary Models in One Go

By Li Hailun

Edited by Xu Qingyang

On June 2, US local time, Microsoft’s Build 2026 developer conference opened at Fort Mason in San Francisco. The conference focused on practical applications of cutting-edge AI. Microsoft released a series of products and updates spanning its self-developed AI models, intelligent agent applications, operating system security, developer tools, cloud services, and a new hardware platform.

At its 2025 developer conference, Microsoft set the direction of the “AI agent era,” launching Copilot Studio multi-agent orchestration and Windows AI Foundry and announcing full support for the Model Context Protocol. GitHub Copilot also introduced its own coding agent, called Coding Agent.

In Microsoft’s narrative, 2025 was about “what standards and frameworks to use in the agent era,” while 2026 focuses on “how to truly get things running with our own models and products.” At the model layer, Microsoft has filled the gap with self-developed flagship models that can carry real workloads. At the product layer, it has pushed agents from demonstrations into full-stack deployments across systems, hardware, and cloud.

The core releases at the event can be divided into six groups: the MAI self-developed model family; an intelligent agent ecosystem represented by Scout and the GitHub Copilot application; the Windows system-level AI security sandbox MXC; the Surface RTX Spark Dev Box for developers, along with system optimizations; Project Solara, a new intelligent agent device platform; and developer tools and governance frameworks including Microsoft IQ, Rayfin, ASSERT, and ACS.

01 Seven models trained from scratch, rejecting distillation

The entire keynote unfolded around a vision statement from Microsoft CEO Satya Nadella. After he proposed the “agent-first” strategic framework, executives from each business line took the stage in sequence, releasing specific products to put the framework into practice.

At the conference, Suleiman, head of Microsoft’s AI division, announced seven new models developed internally by Microsoft AI, all unified under the MAI family.

He described MAI’s mission as building a “climbing machine”: through continuous investment in compute, better data, and more precise evaluation, the models improve themselves loop after loop, keeping users at the forefront of the technology.

On training compute, Suleiman said that the compute used to train frontier models has grown a trillionfold and is expected to grow another thousandfold over the next three years. All of Microsoft’s MAI models are “trained from zero, climbing from scratch with zero distillation,” without relying on third-party model outputs for training.
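To put the projected figure in perspective, a thousandfold increase over three years works out to roughly a tenfold increase each year. The short calculation below assumes the growth compounds evenly from year to year, which is an illustrative assumption; the figure as reported gives no year-by-year breakdown.

```python
# Implied yearly growth if a 1000x increase is spread evenly over 3 years.
# Even compounding is an illustrative assumption, not a reported figure.
total_growth = 1000
years = 3

annual_factor = total_growth ** (1 / years)
print(f"Implied annual growth: about {annual_factor:.1f}x per year")
```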

Microsoft AI division head Suleiman introduces seven self-developed models

The specific models are as follows:

Flagship reasoning model MAI-Thinking-1 — a medium-sized model. Microsoft says that in key software engineering tests, its performance can match the best models on the market. In blind comparative tests, human evaluators’ preference for it is about on par with Sonnet 4.6. This model is trained from scratch with clean data and does not use third-party model distillation.

Programming model MAI-Code-1-Flash — an inference-efficient agentic coding model with 5 billion parameters. It is tailored for and deeply integrated with GitHub Copilot, VS Code, and Microsoft’s technology stack. Microsoft states it can be comparable to Haiku but at a lower cost.

Text-to-image model MAI-Image-2.5 and its ultra-efficient Flash variant. They support text-to-image and image editing. Microsoft claims that in Arena scoring, they surpass Google’s Nano Banana Pro.

Transcription model MAI-Transcribe-1.5 — with SOTA-level accuracy. It is claimed to be five times faster than competing models, with built-in domain-specific terminology recognition for 43 languages.

Speech generation model MAI-Voice-2 — providing high-quality, naturally sounding speech generation. It supports 15 languages and can adapt voices from short samples, with anti-misuse protection measures. Its Flash variant will be released soon, achieving the same functionality at lower cost.

All models share the same data specifications, infrastructure, and evaluation framework. In addition to being distributed on Azure Foundry and optimized for Microsoft’s first-party products, the models will also be available to developers on OpenRouter, Fireworks, and Baseten. For the first time, developers will be able to adjust the model weights themselves.

At the event, Nadella introduced Microsoft Frontier Tuning, a way for enterprises to customize models using their own working data. Its logic is that the most valuable data is not generic corpora, but the real traces, steps, and decisions of agents performing tasks within the enterprise.

Microsoft CEO Nadella introduces Frontier Tuning

This mechanism connects MAI models to real business workflows, enabling the model to learn as it goes in real environments. Suleiman said, “You are building your own model: you train it in your environment with your data, and under your control. Your organization’s knowledge becomes part of the model—and belongs only to you.”

In terms of outcomes, the MAI model tuned for Excel is comparable to the GPT-5.4 level, while improving efficiency by 10x. After McKinsey adopted Frontier Tuning, MAI achieved the highest win rate among all tested models, with costs reduced by about 10x.

In the healthcare domain, Microsoft announced a partnership with the Mayo Clinic to jointly build a cutting-edge AI model for healthcare. The model will combine Mayo Clinic’s clinical expertise, de-identified clinical data, and longitudinal insights with Microsoft’s foundational AI capabilities.

Microsoft also revealed that MAI models are being co-designed with its self-developed Maia 200 chip. Through joint hardware-software optimization, a 1.4x efficiency improvement has already been achieved.

02 Full deployment of the intelligent agent ecosystem

At the conference, Microsoft declared a major transformation toward “agent-first,” aiming to automate how knowledge workers use software by embedding AI assistants into everyday office interactions.

Scout is the core intelligent agent product released in this wave. This “always-on” AI Agent, built on the OpenClaw framework, can interact in Microsoft Teams like a human colleague.

Scout can read users’ work messages, calendars, and email inboxes, complete tasks automatically, reschedule conflicting meetings, and draft professional-sounding replies. Users can send it instructions directly in Teams and can give it a name.

Omar Shahin, the newly appointed corporate vice president, explained Scout’s design philosophy: “Your company is essentially hiring your assistant. The whole point of having a personal assistant is that when you’re not at work, they’re still working.”

Scout is provided through Microsoft Frontier, which requires a GitHub Copilot subscription. Microsoft is also testing a Scout desktop application, which will roll out to subscribers who opt in to “Frontier” features. Shahin said that within Microsoft, the sales department is both the largest and the fastest-growing group of users of the tool.

The GitHub Copilot desktop application is another major release. Introducing it, GitHub Chief Product Officer Mario Rodriguez described it as a “native desktop experience built on top of GitHub, with agent-native capabilities.”

Through a unified “My Work” view, developers can see dynamic work across connected repositories, including active sessions, issues, pull requests, and background automation. Each session runs in its own Git worktree, and parallel agents do not interfere with each other. The application features Agent Merge, which can guide pull requests through review, checks, and merging. The Canvas interface provides a bidirectional human-AI interaction layer, allowing developers to inspect, guide, and verify the work performed by agents on their behalf.
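The one-worktree-per-session idea can be illustrated with Git’s own `git worktree add` command, which checks out a new branch into a separate working directory so that parallel sessions never share files. The sketch below only builds the commands rather than running them; the directory layout, branch naming, and function name are hypothetical and are not taken from the GitHub Copilot application.

```python
from pathlib import PurePosixPath


def worktree_command(repo_root: str, worktrees_root: str, session_id: str) -> list[str]:
    """Build the git command that gives one agent session its own worktree.

    The layout (<worktrees_root>/<session_id>) and the branch name
    (agent/<session_id>) are hypothetical choices for illustration.
    """
    worktree_dir = PurePosixPath(worktrees_root) / session_id
    branch = f"agent/{session_id}"
    # `git worktree add -b <branch> <path>` creates a new branch and checks
    # it out in its own directory, separate from the main working tree.
    return ["git", "-C", repo_root, "worktree", "add", "-b", branch, str(worktree_dir)]


# Two parallel sessions get two independent working directories.
for session in ("fix-login-bug", "update-docs"):
    print(" ".join(worktree_command("/repos/app", "/repos/app-worktrees", session)))
```

Because each session edits files only inside its own directory and branch, one agent’s uncommitted changes cannot collide with another’s.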

The GitHub Copilot application is available as a technical preview on Windows 11, Windows 11 on Arm, Mac, and Linux. It requires a GitHub Copilot subscription and will be opened to Copilot Free users in the future. The application supports cloud and local sandboxes as well as code review, both with policy support.

For intelligent agent security governance, Microsoft released Agent Control Specification (ACS). This is a new open-source standard designed to give developers a more consistent, more granular approach to controlling AI agent behavior. With ACS, development, compliance, and security teams can define policy files for agents, specifying what agents may do, what they absolutely must not do, when human approval is required, and what evidence must be recorded for review.

ACS is released as an SDK and comes with plugins for frameworks and tools such as LangChain, OpenAI Agents SDK, Anthropic Agents SDK, AutoGen, CrewAI, Semantic Kernel, Microsoft.Extensions.AI, and MCP tools. Because a policy can be written as a single file, it can be bundled with an agent and follow it across different frameworks and environments.
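A minimal sketch of the kind of policy described here might look like the following. It is not the ACS schema or SDK, neither of which is detailed in this article; the class, field names, and action names are all hypothetical. It only shows the four ideas in the description: what an agent may do, what it must never do, when a human must approve, and what evidence is recorded.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Hypothetical policy shape for illustration; not the real ACS format."""
    allowed: set[str]
    forbidden: set[str]
    needs_approval: set[str]
    audit_log: list[dict] = field(default_factory=list)

    def decide(self, agent_id: str, action: str) -> str:
        if action in self.forbidden:
            decision = "deny"
        elif action in self.needs_approval:
            decision = "ask_human"
        elif action in self.allowed:
            decision = "allow"
        else:
            decision = "deny"  # anything unlisted is denied by default
        # Record evidence for later review, whatever the outcome.
        self.audit_log.append({"agent": agent_id, "action": action, "decision": decision})
        return decision


policy = AgentPolicy(
    allowed={"read_calendar", "draft_email"},
    forbidden={"delete_mailbox"},
    needs_approval={"send_email"},
)

for action in ("read_calendar", "send_email", "delete_mailbox", "wire_money"):
    print(action, "->", policy.decide("scout-demo", action))
```

The default-deny branch is a design choice made for this sketch: an action the policy author never anticipated is refused rather than silently permitted.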

ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing) is a companion testing tool. It is an open-source framework that uses AI to convert high-level natural-language descriptions of targets, policies, or expected behaviors into structured, scored tests.

ASSERT takes concise natural-language descriptions of expected AI model behavior and generates sets of acceptable and unacceptable behaviors, problem scenarios, and test cases. It then runs the tests against the target system and scores the results. It also records the path taken by the AI system, including intermediate operations and tool calls, so developers can inspect where failures occurred.
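The run-score-trace loop can be sketched in a few lines. This is not the ASSERT API: the test cases below are written by hand, whereas ASSERT is described as generating them with AI from a natural-language description, and the toy agent, class, and function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BehaviorCase:
    prompt: str
    must_include: str      # marker of acceptable behavior
    must_not_include: str  # marker of unacceptable behavior


def toy_agent(prompt: str) -> tuple[str, list[str]]:
    """Stand-in target system: returns an answer plus the tool calls it made."""
    if "refund" in prompt:
        return "I have escalated this to a human agent.", ["lookup_order", "escalate"]
    return "Your password is hunter2.", ["read_secrets"]


def run_suite(target: Callable[[str], tuple[str, list[str]]], cases: list[BehaviorCase]):
    results = []
    for case in cases:
        answer, trace = target(case.prompt)
        passed = case.must_include in answer and case.must_not_include not in answer
        # Keep the tool-call trace so a failure can be inspected afterwards.
        results.append({"prompt": case.prompt, "passed": passed, "trace": trace})
    score = sum(r["passed"] for r in results) / len(results)
    return score, results


cases = [
    BehaviorCase("Customer asks for a refund", "escalated", "password"),
    BehaviorCase("What is the admin password?", "cannot", "hunter2"),
]
score, results = run_suite(toy_agent, cases)
print(f"score = {score:.2f}")
for r in results:
    print(r["prompt"], "| passed:", r["passed"], "| trace:", r["trace"])
```

The failing second case shows why the recorded trace matters: it reveals that the agent called `read_secrets`, which points to where the behavior went wrong rather than just reporting a low score.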

03 The more autonomous an agent is, the more dangerous it is—Microsoft draws system-level red lines with MXC

As AI agents become more powerful and more autonomous, Microsoft identified a critical problem: the more autonomous and useful an agent is, the more dangerous it becomes to let it run on enterprise networks without guardrails. Microsoft’s official blog describes this as a “multi-layer systems problem.” Every interaction between agents and humans, tools, applications, models, and other agents “creates new attack surfaces and introduces different failure modes.”

To address this, Microsoft introduced Microsoft Execution Containers (MXC), a policy-driven execution layer built into the Windows operating system itself. Pavan Dwarurri, executive vice president of Windows and Devices, emphasized that this is essential to making AI agents commercially viable. It enables agents to be “centered on security, containment, isolation, and user control,” making them safe enough to deploy to both ordinary consumers and enterprises.

Microsoft CEO Nadella introduces MXC system-level security sandbox

MXC is essentially an SDK and policy model embedded in Windows and Windows Subsystem for Linux, providing what Microsoft calls a “composable sandbox spectrum.” This spectrum ranges from lightweight process isolation (already adopted by GitHub Copilot’s command-line interface) to micro virtual machines, Linux containers, and full cloud instances running on Windows 365.

This system separates the execution of an agent from the user’s desktop, clipboard, user interface, and input devices. Each agent is bound to an identity—either a local ID or a cloud-provisioned identity supported by Microsoft Entra—ensuring that every action by the agent can be attributed, audited, and governed.

MXC is now available as an early preview. Agent 365, integrated with Microsoft’s enterprise security stack, will launch a preview in July 2026. It layers Entra identity services, Intune device management, Defender threat protection, and Purview data governance on top of MXC, enabling IT departments to centrally manage agent isolation.

On the partner side, OpenAI, NVIDIA, Manus, Nous Research (the maker of Hermes Agent), and the OpenClaw open-source project have announced that they are building on MXC.

Notably, the collaboration with OpenClaw began when its creator Peter Steinberger proactively reached out to Microsoft to express interest. Ultimately, it evolved into a comprehensive platform-level partnership.

04 Three updates: keep Edge’s AI running even without being online

Microsoft Edge browser has also received upgrades to local AI capabilities. Microsoft says that since the introduction of Phi-4-mini at Build 2025, the team has expanded on-device AI capabilities based on feedback from web developers.

The first update is Aion-1.0-Instruct, a smaller, faster, and more efficient local small language model than Phi-4-mini. It can run on PCs with weaker GPU and CPU capabilities. It is currently offered as a developer preview and will land on Hugging Face in July.

The second is a pair of language detection and translation APIs, shipped with the Edge 148 release. These two JavaScript APIs are powered by Edge’s built-in on-device AI models. They allow websites and browser extensions to identify the language of a piece of text and translate between language pairs. Microsoft describes the result as “fast, high-quality translation supporting more than 145 languages, optimized for translation workloads on the web.” The service is free.

The third update is speech recognition via the Web Speech API, offered experimentally in the Edge Canary and Dev channels. This API helps developers integrate speech or audio input into websites and browser extensions, running locally on the device. It also supports cloud-backed speech-to-text and text-to-speech services as a fallback.

05 Iterations on developer tools and cloud services

On the data intelligence layer, Microsoft released Microsoft IQ, which combines four previously separate context sources into a shared foundation for agents.

Microsoft Fabric CTO Amir Nezati offered an analogy: the green-code waterfall in The Matrix isn’t decoration—it’s the foundation of that world. He said, “What we’re doing in the data world is building a data-based reality for agents.”

Microsoft IQ’s four context sources are:

Work IQ, which captures how an organization operates day to day, using email, documents, meetings, and schedules.

Foundry IQ, which manages institutional knowledge, planning and indexing knowledge bases.

Fabric IQ, which models the real-time operating state of the business through data, defining entities, relationships, and business rules anchored by real-time signals based on Fabric. This feature is expected to be officially released in the coming months.

Web IQ, which adds real-time global context from the web.

With this context system in place, an agent is no longer just a tool that executes commands, but a virtual employee that understands how the company runs.

A shared “foundation” alone isn’t enough. When agents start generating applications, each application needs a backend. Left unmanaged, these applications could form new data silos outside the context layer. To solve this, Microsoft released Rayfin, an open-source SDK and CLI that deploys agent-built applications directly onto the Fabric platform as governed production backends. By default, application data flows into the unified OneLake data lake and then feeds back into Microsoft IQ rather than piling up externally.

Microsoft positions it as a competitor to Supabase and Neon. The core difference is governance: all applications go through the same set of data and compliance channels. Nezati said this is a bidirectional process: when an agent builds an application, it pulls information from the enterprise’s data rules; as the application runs, the data it generates updates those rules, so the next agent can use the latest information.

Alongside this, Microsoft also launched WSL container functionality, enabling developers to create and manage Linux containers directly on Windows. Microsoft also provided a command-line interface and APIs, allowing Linux containers to run inside native Windows applications. This feature will be available as a public preview in the coming months.

To avoid wasting developers’ time on environment setup, Microsoft also released Windows Developer Configurations, which lets users quickly set up a new machine and apply developer-optimized configurations. It automatically installs WSL, PowerShell 7, and Visual Studio Code, while enabling Git version control in File Explorer and showing hidden files.

06 Two new hardware devices: bring the hard AI work back to the local end

This Build was not just a software showcase of models, agents, and developer tools; hardware was also front and center. As AI workloads consume ever more compute and agentic workflows need to run continuously, Microsoft turned its focus to the devices in developers’ hands. Instead of renting expensive cloud GPUs every time, why not run these tasks directly on local machines?

Andrew Hill, Vice President of Surface, announced two new devices:

The Surface RTX Spark Dev Box is a compact developer PC equipped with the NVIDIA RTX Spark superchip. The chip combines an NVIDIA Blackwell RTX GPU and an NVIDIA Grace CPU to deliver up to 1 petaflop of AI compute, with 128 GB of unified memory.

The device uses an aluminum chassis that also serves as a heatsink. It is designed for long-duration training tasks, large-model inference, and complex agentic workflows. It comes preloaded with Windows 11 Pro, and at the image configuration level it is preconfigured for developers: a dark theme, a simplified taskbar for development, removed widgets, “Do Not Disturb” mode enabled, developer mode enabled, and PowerShell 7 set as the default shell. WSL 2 is configured with GPU passthrough and CUDA support. VS Code, GitHub Copilot, Git, Python, and Node.js are all installed.

In terms of security, the Surface RTX Spark Dev Box is built on a chip-to-cloud security stack aligned with Microsoft’s Zero Trust principles. This includes the Secured-core PC architecture, BitLocker encryption, and Microsoft Defender protection. It can be integrated with Entra ID and Intune to enable large-scale management and governance.

Hill explained: “The way developers build software is undergoing a fundamental change. As AI model capabilities and complexity continue to grow, agentic workflows require sustained compute power. And even for those tasks that don’t require the most advanced models, each iteration may still incur cloud costs.”

The other device, Surface Laptop Ultra, is a high-performance notebook designed specifically for developers, creators, and technical professionals. It was launched earlier. Together, the two represent Surface’s next step: dedicated devices for people building the future. The Surface RTX Spark Dev Box will be available in the United States later this year, sold exclusively on Microsoft.com.

07 A new platform so devices run AI agents rather than applications

Stevie Bathish, head of Microsoft Applied Sciences, introduced an internal project called Project Solara.

This is a new platform from chip to cloud, based on Android rather than Windows, aiming to let devices run AI agents instead of applications. Bathish explained its starting point: “The boundaries are collapsing. You don’t necessarily need the traditional app model. You don’t need the traditional ways to develop an experience.”

The first two concept devices were shown at Build:

A desktop hub device placed beside a PC. It responds to voice commands, logs in users via facial recognition, and presents the most urgent matters of the day. When connected to a display, it can transform into a complete Windows machine running in the cloud.

A wearable badge device that reimagines the standard employee ID card. A single press on the fingerprint sensor wakes the agent. A light touch can record and transcribe conversations, and a built-in camera allows the agent to take actions based on what the user sees.

In a healthcare demo, the badge ran an agent designed for medical staff. It could scan patient QR codes, record and transcribe patient visits, log vital signs, and issue prescriptions. In another demo, the built-in camera scanned a brainstorming board of office renovation ideas, and the agent proposed adding greenery.

Bathish said Microsoft will not produce these devices itself. Instead, Microsoft envisions hardware manufacturers and other industry partners turning these reference designs into their own products—each tailored to a specific industry, company, or scenario.

08 Quantum chip upgrade: reliability improved by 1000x

Microsoft also released the next-generation topological quantum chip, Majorana 2.

Compared with the previous Majorana 1, the core change this time is that the superconducting material changes from aluminum to lead. This adjustment boosts qubit reliability by 1000 times. The average qubit lifetime reaches 20 seconds, and some instances can last up to one minute.

Qubits built with other technical approaches typically achieve lifetimes only in the microsecond range. Based on this progress, Microsoft has halved its expected timeline for realizing scalable quantum computers and now expects to achieve this before 2029.

The development of this chip used Microsoft Discovery’s agentic AI capabilities throughout. AI agents took on tasks such as manufacturing management, automated measurement of quantum states, and cross-disciplinary data analysis. This compressed measurement cycles that used to take weeks into a timeframe of days, and helped identify correlations that humans would have difficulty noticing from nearly two decades of accumulated data.

Microsoft Fellow Chetan Nayak said, “Agentic AI is almost embedded in everything we do.” But he emphasized that AI only provides guidance—“it’s always scientists in the loop.”

The Microsoft Discovery platform was also officially launched at this conference. It is an organization-level platform for frontier research, allowing researchers to deploy autonomous agent teams guided by humans to conduct hypothesis generation, experiment optimization, and theoretical validation. Microsoft also released an early preview of the Microsoft Discovery app, which individuals can download for free and run locally using a GitHub Copilot account.

Contributing translator Jin Lu also contributed to this article.
