Multimodal, Made Easy

Microsoft has rolled out a major upgrade to Azure AI Foundry, introducing new multimodal models that let developers move beyond text and build richer, more interactive AI experiences. With the addition of GPT-image-1-mini, GPT-realtime-mini, and GPT-audio-mini, along with enhanced safety layers in GPT-5, Azure now enables faster, more scalable development across text, images, audio, and real-time interactions, all in one unified environment.

What’s New

GPT-image-1-mini delivers lightweight, fast image generation, ideal for design, prototyping, and creative workflows.
GPT-realtime-mini and GPT-audio-mini unlock low-latency, cost-effective voice and audio capabilities for assistants, translation tools, and media applications.
GPT-5-chat-latest includes upgraded safety features for more responsible, aligned conversations.
GPT-5-pro brings advanced reasoning for research, analytics, and decision support.

Why It Matters

These updates bring developers closer to building true multimodal systems: AI that can see, listen, speak, and reason. Whether you’re developing real-time voice agents, creative image tools, or analytical engines, Azure AI Foundry now offers a single, scalable platform to bring those ideas to life.

Microsoft has also announced Sora 2, coming soon with advanced video and audio generation capabilities.

Together, these releases mark another step toward accessible, enterprise-ready AI.
