NVIDIA launches a family of open models for agentic artificial intelligence

The Nemotron 3 line – comprising Nano, Super and Ultra – delivers leading performance for multi-agent AI systems, combining advanced reasoning, conversation and collaboration capabilities. The models use a hybrid Mamba-Transformer mixture-of-experts (MoE) architecture, delivering best-in-class inference throughput while supporting context windows of up to 1 million tokens.

Nemotron 3 Nano, the smallest model, is optimized for cost-effective inference and tasks such as software debugging, content summarization, AI-assistant workflows and information retrieval. Although it has roughly 30 billion total parameters, it activates only about 3 billion per token. Thanks to its hybrid MoE design, Nano achieves up to 4x higher token throughput than its predecessor and generates 60% fewer tokens during inference, all while maintaining leading accuracy. Early benchmarks show Nano outperforming comparable open models such as GPT-OSS-20B and Qwen3-30B on reasoning and long-context tasks.
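
Taken together, those two figures compound. As a rough, back-of-the-envelope illustration only, assuming the 4x throughput gain and the 60% reduction in generated tokens apply independently (and ignoring prefill and other overheads), end-to-end response time would improve by roughly an order of magnitude:

```python
# Illustrative estimate only, using the figures quoted above with made-up baseline numbers.
baseline_tokens = 1000        # tokens the previous model would generate for a response (illustrative)
baseline_tps = 100            # previous model's tokens per second (illustrative)

nano_tokens = baseline_tokens * (1 - 0.60)   # 60% fewer tokens generated
nano_tps = baseline_tps * 4                  # 4x token throughput

baseline_latency = baseline_tokens / baseline_tps   # 10.0 s
nano_latency = nano_tokens / nano_tps                # 1.0 s

print(f"Estimated end-to-end speedup: {baseline_latency / nano_latency:.0f}x")  # ~10x under these assumptions
```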

Nemotron 3 Super and Ultra extend these capabilities to large collaborative agent systems and complex AI applications, adding innovations such as implicit MoE, a hardware-aware expert design that increases model quality without sacrificing performance, and multi-token prediction (MTP), which improves long-form text generation and multi-step reasoning. Both larger models are trained in NVIDIA's NVFP4 format, allowing faster training and reduced memory requirements.

All Nemotron 3 models are post-trained with multi-environment reinforcement learning (RL), which prepares them for tasks including mathematical and scientific reasoning, competitive coding, instruction following, software engineering, chat and multi-agent tool use. The models also support granular control over the reasoning budget at inference time, letting developers tune compute spend while preserving accuracy.
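
In practice, a developer would set that budget per request. The sketch below shows what this could look like against an OpenAI-compatible NVIDIA NIM endpoint; the endpoint path follows the standard NIM convention, but the model name and the "max_thinking_tokens" field are hypothetical placeholders for the budget control described above, not a confirmed API.

```python
# Hypothetical sketch of per-request reasoning-budget control via an OpenAI-compatible NIM endpoint.
# The model name and "max_thinking_tokens" parameter are placeholders; check the model card for the real names.
import requests

payload = {
    "model": "nvidia/nemotron-3-nano",   # placeholder model name
    "messages": [
        {"role": "user", "content": "Plan the test cases for this API change."}
    ],
    "max_tokens": 1024,
    "max_thinking_tokens": 256,          # hypothetical: cap on tokens spent on internal reasoning
}

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # default OpenAI-compatible NIM route
    json=payload,
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```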

NVIDIA has also released a comprehensive set of datasets, training libraries and evaluation tools, including more than three trillion tokens of pre-training and reinforcement learning data, the open-source NeMo Gym and NeMo RL libraries, and the Nemotron Agentic Safety Dataset for real-world safety assessment.

The Nemotron 3 family is designed to let developers, startups and enterprises build specialized AI agents transparently and efficiently. Nano is available now through Hugging Face, NVIDIA NIM microservices, and major cloud and AI platforms including AWS, Google Cloud and Microsoft Foundry. Super and Ultra are scheduled for release in the first half of 2026.
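
For developers starting from the Hugging Face route, loading the model should follow the usual transformers workflow. The repository name below is a placeholder rather than the confirmed model ID, so treat this as a minimal sketch and check the NVIDIA organization page on Hugging Face for the actual checkpoint:

```python
# Minimal sketch of loading a Nemotron 3 Nano checkpoint with Hugging Face transformers.
# "nvidia/Nemotron-3-Nano" is a placeholder repository ID, not a confirmed name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Nemotron-3-Nano"  # placeholder ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # half-precision weights to reduce memory use
    device_map="auto",            # spread layers across available devices
    trust_remote_code=True,       # hybrid Mamba-Transformer blocks may ship custom modeling code
)

messages = [{"role": "user", "content": "Summarize the key risks in this incident report: ..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```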

Early adopters such as Accenture, ServiceNow, Perplexity and Palantir are already integrating Nemotron 3 models into AI workflows for manufacturing, cybersecurity, software development, media and enterprise operations.

With Nemotron 3, NVIDIA aims to set a new standard for efficient, accurate and open AI models, enabling developers to scale agentic applications from prototype to enterprise deployment while maintaining transparency, cost-effectiveness and state-of-the-art performance.
