2026.06.02 21:09

🚨 do you understand what just happened to Microsoft.. Microsoft just dropped seven of its own MAI models, trained from scratch with zero distillation, and said its custom-tuned models already match GPT-tier quality at 10x lower cost. The partner that pays OpenAI's bills is now quietly building the thing that replaces it.

- MAI-Thinking-1 hits human-preference parity with Sonnet 4.6 and 53% on SWE Bench Pro, right next to Opus 4.6
- MAI-Code-1-Flash delivers Haiku-class coding at just 5B params and is already shipping inside GitHub Copilot
- A custom MAI model tuned for Excel matched GPT-5.4 while running up to 10x more efficiently
- Their real weapon isn't the weights, it's that they're already inside everyone's Office, Teams, and VS Code

The hard part was never training a good model. The hard part is being the room everyone already works in. Microsoft owns the room.


Mustafa Suleyman@mustafasuleyman

2026.06.02 18:38

Super excited to announce seven new world-class MAI models today. They represent what we consider a new era in AI designed to keep you in control and on the frontier.

First is our text foundation model, MAI-Thinking-1, exceptionally strong on reasoning and SWE tasks.

- It’s a 35B active parameter MoE with a 256K context window. Independent human raters on Surge prefer it for overall quality in blind side-by-sides versus Sonnet 4.6, and it’s achieved 97% on AIME 2025, the key measure of its general-purpose reasoning abilities.
- It's at 53% on SWE Bench Pro, placing it right alongside Opus 4.6 on one of the toughest coding benchmarks.
- And since we co-designed our models with our own silicon, MAI-Thinking-1 is optimized on our MAIA 200 chip. Benchmarking head-to-head against the GB200, we see 30% better performance per dollar as well as a 1.4x performance-per-watt gain when running our MAI models on the MAIA 200 end-to-end.

Next is MAI-Image-2.5 and its Flash variant. Two super strong models now at #2 on the leaderboards, surpassing the score of Nano Banana 2 on image editing.

Last for now is MAI-Code-1-Flash, our new inference-efficient coding model, especially tuned for VS Code and GitHub Copilot CLI.

- Code-1-Flash achieves 51% on SWE Bench Pro, despite having just 5B parameters, putting it closer to Haiku in size but cheaper in cost.

All of this is the foundation for Microsoft Frontier Tuning. It lets you customize our models to create custom, company-specific agents that only you control. You can make our model, your model. Your data. Your agents. Your moat.

Early adopters are already seeing a difference. When we tuned our models for McKinsey’s tasks, MAI delivered the highest win rate, outperforming GPT-5.5 on quality, while being 10x lower on cost.

Also really excited to be collaborating with the amazing team at Mayo Clinic to jointly train a new frontier AI model for healthcare.
Our announcements today mark another milestone on the road to humanist superintelligence. You can learn more, including about our other new models, in our latest blog:
