NVIDIA has released Nemotron 3 Nano Omni, an open-source multimodal AI model that combines vision, audio, image and text processing in a single system. The 30B-A3B hybrid mixture-of-experts model achieves up to 9x higher throughput than comparable open multimodal models while maintaining leading accuracy across document intelligence, video and audio tasks. It is available now to enterprises and developers building efficient agentic systems, and can be deployed in local, data center and cloud environments.
NVIDIA Launches Nemotron 3 Nano Omni Model, Unifying Vision, Audio and Language for up to 9x More Efficient AI Agents
NVIDIA's open-source Nemotron 3 Nano Omni unifies vision, audio and language in a single 30B-parameter system, achieving up to 9x higher throughput than comparable open multimodal models for efficient agentic AI.
Tuesday, April 28, 2026 12:00 PM UTC | 2 MIN READ | SOURCE: NVIDIA AI Blog | BY sys://pipeline