Microsoft Expands Foundry With Seven In-House MAI Models


TL;DR

  • Foundry Rollout: Microsoft has added seven in-house MAI models across reasoning, code, image, voice, and transcription workflows.
  • Reasoning Model: MAI-Thinking-1 is in private preview with 35 billion active parameters and a 256K-token context window.
  • Customer Control: Microsoft says developers will be able to tune model weights, a deeper option than prompt engineering alone.
  • Competitive Context: Google and Anthropic have recently moved Gemini and Claude models into similar developer-focused workflows.

Microsoft has launched its in-house MAI model family into Foundry across reasoning and multimodal workloads, turning the Build 2026 update into a broader developer push rather than a single-model release. MAI-Thinking-1 is the flagship reasoning system, while additional code, image, and transcription models extend the lineup into everyday developer and enterprise workflows.

Developers and enterprise teams get limited Foundry access first as Microsoft moves more first-party systems toward customer testing. Foundry remains Microsoft’s platform for finding, deploying, and governing AI models, but the wider MAI rollout puts Microsoft-owned models into the same decision path where customers already compare OpenAI, Anthropic, Google, and specialist speech or coding tools.

MAI-Thinking-1 Leads the Foundry Push

Microsoft’s Foundry rollout covers AI systems for Microsoft Foundry across reasoning, image, voice, and speech. Microsoft AI also launched seven new MAI models across image, voice, transcription, coding, and reasoning, with MAI-Thinking-1 in private preview for Foundry users and a MAI Playground public preview planned later.

MAI-Thinking-1 carries the central technical role in the rollout. A sparse Mixture-of-Experts design routes each task to selected expert subnetworks rather than activating the whole system, which can preserve overall model capacity while limiting the compute used for a given request.

Its specification lists 35 billion active parameters, roughly 1 trillion total parameters, a 256K-token context window, function calling, developer instructions, and compatibility with the Chat Completions API. A 256K-token window gives the model room to consider longer codebases, documents, or instructions in one prompt, while Chat Completions compatibility reduces integration work for teams that already use that API pattern.