Nvidia has released Nemotron 3, the latest in the company’s family of open reasoning models, designed to support agentic AI systems that operate across long contexts and multiple agents. The release includes three model sizes, Nano, Super, and Ultra, along with new open datasets and reinforcement learning tools to help developers build specialized AI systems for production use. Nvidia gave the following details about the size and use cases of the three models:
- Nemotron 3 Nano, a small, 30-billion-parameter model that activates up to 3 billion parameters at a time for targeted, highly efficient tasks.
- Nemotron 3 Super, a high-accuracy reasoning model with approximately 100 billion parameters and up to 10 billion active per token, for multi-agent applications.
- Nemotron 3 Ultra, a large reasoning engine with about 500 billion parameters and up to 50 billion active per token, for complex AI applications.
Nvidia originally developed Nemotron as an open foundation for building and customizing AI systems while giving developers and enterprises transparent access to models, data, and training techniques they could inspect and adjust. The project was designed to support reasoning capabilities and domain-specific specialization, allowing organizations to adapt models to their own data, workflows, and regulatory constraints. Nvidia releases Nemotron model weights along with datasets, numerical precision methods, and software for training and inference.
The Nemotron 3 models are built around a hybrid mixture-of-experts architecture that activates only a fraction of total parameters per token, allowing the models to deliver higher throughput and lower inference cost than earlier versions. Nemotron 3 Nano, available now, is a 30-billion-parameter model with roughly 3 billion active parameters at a time. Nvidia says it delivers up to four times higher token throughput than Nemotron 2 Nano and supports a context window of up to one million tokens. The larger Super and Ultra models, expected in the first half of 2026, scale the same architecture to higher accuracy use cases involving more agents and longer reasoning chains.
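The efficiency claim rests on mixture-of-experts routing: for each token, a small router selects a few expert sub-networks, so only a fraction of the total parameters participate in the computation. Below is a minimal, illustrative top-k MoE sketch; the function names, dimensions, and router design are generic assumptions for exposition, not Nemotron 3 internals.

```python
import numpy as np

def moe_layer(x, gate_w, expert_ws, k=2):
    """Illustrative top-k mixture-of-experts routing for one token.

    x: (d,) token hidden state; gate_w: (d, n_experts) router weights;
    expert_ws: list of (d, d) expert weight matrices.
    All names are generic, not Nemotron internals.
    """
    logits = x @ gate_w                 # router score for each expert
    topk = np.argsort(logits)[-k:]      # only k experts are activated
    weights = np.exp(logits[topk])
    weights /= weights.sum()            # softmax over the selected experts
    # Only the chosen experts' parameters are touched for this token,
    # which is why active parameters << total parameters.
    return sum(w * (x @ expert_ws[i]) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
out = moe_layer(rng.normal(size=d),
                rng.normal(size=(d, n_experts)),
                [rng.normal(size=(d, d)) for _ in range(n_experts)])
```

With 16 experts and k=2, roughly an eighth of the expert parameters are active per token, which is the same kind of total-versus-active ratio Nvidia cites for the Nano, Super, and Ultra models.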
Along with the models, Nvidia is releasing three trillion tokens of new pretraining, post-training, and reinforcement learning data, as well as the open source libraries NeMo Gym and NeMo RL. These libraries provide the training environments and evaluation frameworks Nvidia used internally to build Nemotron 3, giving developers greater visibility into, and control over, how the models are trained and evaluated.
In a press briefing, Kari Briski, Nvidia’s VP of Generative AI Software for enterprise, was asked if Nvidia is looking to become a frontier model builder and compete with proprietary models.
“We don’t have to compete with proprietary models,” she said, noting that Nvidia has built Nemotron for the open developer ecosystem and also to “really push the limits of our systems for both training and inference so that we know that we are building the best systems for all of our partners. I wouldn’t say it is competing. It’s building it for ourselves and giving it to the ecosystem to trust and develop on top of.”
A natural question is how developers would use Nemotron if they are already building agents with proprietary models. Briski said Nvidia does not expect Nemotron to replace proprietary models already in use. Instead, she described an emerging pattern in which developers build agentic systems from multiple models that evolve over time. Teams may start with a single model, but as they collect domain data and refine their applications, they increasingly fine-tune open models for specific tasks and route that work across specialized agents. In this pattern, open models like Nemotron optimize the parts of an application where efficiency and control matter most, while proprietary models continue to be used where their strengths apply.
“It’s not that we’re seeing replacement; we’re seeing growth,” she said.
Nvidia says early adopters of Nemotron include companies like CrowdStrike, ServiceNow, and Perplexity, which are integrating the models into their AI platforms. In AI search engine Perplexity’s case, the company uses an agent routing system that dynamically directs each query to the model best suited for the task. Nvidia said Perplexity uses Nemotron’s open models and related technologies, along with other open and proprietary models, allowing the platform to optimize queries for accuracy, efficiency, and cost.
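The routing pattern described above can be sketched as a simple dispatcher that classifies each query and sends it to the model best suited for it. The model names and the keyword heuristic below are hypothetical placeholders for illustration, not Perplexity's actual routing system.

```python
# Illustrative query router: send each query to the model best suited
# for the task. Model names and classify() are hypothetical examples.
def classify(query: str) -> str:
    q = query.lower()
    if any(w in q for w in ("prove", "derive", "step by step")):
        return "reasoning"        # heavy multi-step reasoning
    if len(q.split()) > 50:
        return "long_context"     # long inputs need a large context window
    return "simple"               # cheap, common lookups

ROUTES = {
    "simple": "open-small-model",         # e.g. a fine-tuned open model
    "long_context": "open-large-model",   # e.g. a long-context open model
    "reasoning": "proprietary-frontier",  # fall back to a frontier model
}

def route(query: str) -> str:
    return ROUTES[classify(query)]
```

In production such a router might be a learned classifier rather than keyword rules, but the structure is the same: cheap open models handle the bulk of traffic, and more expensive models are reserved for the queries that need them.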
As shown in that example, Nvidia is betting that agentic AI systems will rely on multiple models working together, rather than a single foundation model. As reasoning workloads grow more complex and inference costs rise, the company is positioning its open models, open datasets, and efficient architectures as essential building blocks for production systems looking to balance cost and control.
The post Nvidia Releases Nemotron 3, Expanding Its Open Models for Agentic AI appeared first on AIwire.