Microsoft has introduced the Maia 200, a next-generation AI inference chip designed to power large AI models, including OpenAI’s GPT-5.2.
The chip is optimized for low-precision compute formats such as FP4 and FP8, trading numeric precision for higher throughput and energy efficiency in real-time AI inference.
Maia 200 delivers over 10 petaFLOPS of FP4 compute and 5 petaFLOPS of FP8, backed by an advanced memory and data-movement architecture.
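To make the low-precision trade-off concrete, the sketch below simulates quantizing values to the FP4 E2M1 format (1 sign, 2 exponent, 1 mantissa bit), the 4-bit floating-point encoding commonly used for inference. This is an illustrative NumPy simulation, not Maia 200 code; the function name and per-tensor scaling scheme are assumptions for the example.

```python
import numpy as np

# Representable non-negative magnitudes of FP4 E2M1
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(x: np.ndarray) -> np.ndarray:
    """Simulate per-tensor symmetric FP4 quantization:
    scale into the FP4 range, snap each value to the nearest
    representable point, then rescale back."""
    scale = np.abs(x).max() / FP4_GRID[-1]  # map the largest |value| to 6.0
    scaled = np.abs(x) / scale
    # Snap each scaled magnitude to the nearest FP4 grid point.
    idx = np.abs(scaled[..., None] - FP4_GRID).argmin(axis=-1)
    return np.sign(x) * FP4_GRID[idx] * scale

x = np.linspace(-3.0, 3.0, 16)
print(quantize_fp4(x))
```

With only 8 distinct magnitudes, FP4 is coarse, which is why hardware like this pairs it with high raw throughput: each value costs half the bits (and roughly half the energy and bandwidth) of FP8.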
It includes 216 GB of high-bandwidth memory (HBM3e) with 7 TB/s of bandwidth, plus 272 MB of on-chip SRAM to reduce data-movement bottlenecks.
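A quick back-of-envelope calculation shows why the memory figures matter for inference. The numbers below simply divide the headline specs quoted above; they are an illustration, not a vendor benchmark.

```python
# Back-of-envelope arithmetic from the headline Maia 200 specs.
fp4_flops = 10e15   # 10 petaFLOPS at FP4
hbm_bw = 7e12       # 7 TB/s HBM3e bandwidth
hbm_bytes = 216e9   # 216 GB HBM capacity

# Arithmetic intensity (FLOPs per byte of HBM traffic) needed
# before the chip becomes compute-bound rather than memory-bound.
ridge_point = fp4_flops / hbm_bw
print(f"{ridge_point:.0f} FLOPs/byte")   # ≈ 1429

# Time to stream all of HBM once -- a rough lower bound on
# generating one token when model weights fill the memory.
stream_time_ms = hbm_bytes / hbm_bw * 1e3
print(f"{stream_time_ms:.1f} ms")        # ≈ 30.9
```

A workload needs roughly 1,400 FLOPs per byte fetched from HBM to saturate the FP4 compute units, which is why the large on-chip SRAM is there: it keeps hot data off the HBM bus.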
Maia 200 is already deployed in Microsoft data centers and will support services like GPT-5.2, Microsoft 365 Copilot, and Azure AI workloads.
Microsoft claims the Maia 200 delivers 30% better performance per dollar than comparable accelerators from rival cloud providers, targeting cost-efficient large-model inference.