NVIDIA Nemotron 3 Nano 30B MoE Model Now Available on Amazon SageMaker JumpStart

Today, we are excited to announce that the NVIDIA Nemotron 3 Nano 30B model, with 3B active parameters, is now generally available in the Amazon SageMaker JumpStart model catalog. You can accelerate innovation and deliver tangible business value with Nemotron 3 Nano on Amazon Web Services (AWS) without managing model deployment complexities. You can power your generative AI applications with Nemotron capabilities using the managed deployment capabilities offered by SageMaker JumpStart.

Nemotron 3 Nano is a small language hybrid mixture of experts (MoE) model with the highest compute efficiency and accuracy for developers to run highly efficient agentic tasks at scale. The model is fully open, with open weights, datasets, and recipes, so developers can adapt, customize, and deploy the model on their own infrastructure to help meet their privacy and security needs. Nemotron 3 Nano excels at coding and reasoning, and leads on benchmarks such as SWE Bench Verified, GPQA Diamond, AIME 2025, Arena Hard v2, and IFBench.

About Nemotron 3 Nano 30B

Nemotron 3 Nano stands out from other models because of its architecture and accuracy, with strong performance across a variety of highly technical skills:

Architecture:
- MoE with a hybrid Transformer-Mamba architecture that supports token budgeting to provide optimal accuracy with minimal reasoning token generation
Accuracy:
- Leading accuracy in coding, scientific reasoning, mathematics, and instruction following
- Leads on benchmarks such as LiveCodeBench, GPQA Diamond, AIME 2025, BFCL, and IFBench (compared to other open language models under 30B parameters)
Applicability:
- 30B parameter model with 3B active parameters
- Has a context window of up to 1 million tokens
- Text-based foundation model that uses text for both input and output

Prerequisites

To get started with Nemotron 3 Nano in Amazon SageMaker JumpStart, you must have a provisioned Amazon SageMaker Studio domain.
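If you are not sure whether your account already has a Studio domain, you can check programmatically. The following is a minimal sketch, assuming you pass in a Boto3 SageMaker client; the helper name `find_in_service_domains` is hypothetical and not part of any AWS SDK. It pages through the `list_domains` API and keeps only the domains whose status is `InService`.

```python
def find_in_service_domains(sagemaker_client):
    """Return (DomainId, DomainName) pairs for Studio domains that are InService.

    `sagemaker_client` is expected to behave like boto3.client("sagemaker").
    """
    domains = []
    kwargs = {}
    while True:
        page = sagemaker_client.list_domains(**kwargs)
        for domain in page.get("Domains", []):
            if domain.get("Status") == "InService":
                domains.append((domain["DomainId"], domain["DomainName"]))
        token = page.get("NextToken")
        if not token:
            return domains
        # Request the next page of results.
        kwargs = {"NextToken": token}


# Example usage (requires AWS credentials; the Region is a placeholder):
# import boto3
# print(find_in_service_domains(boto3.client("sagemaker", region_name="us-east-1")))
```

An empty list suggests that you need to create a domain in that Region before continuing.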

Get started with NVIDIA Nemotron 3 Nano 30B in SageMaker JumpStart

To test the Nemotron 3 Nano model in SageMaker JumpStart, open SageMaker Studio and choose Models in the navigation pane. Search for NVIDIA in the search bar and choose NVIDIA Nemotron 3 Nano 30B as the model.

On the model details page, choose Deploy and follow the prompts to deploy the model.
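Deployment can take several minutes, so it can help to confirm that the endpoint is ready before sending requests. The following is a minimal sketch, assuming a Boto3 SageMaker client and an endpoint name of your choosing; the helper name `wait_until_in_service` and its polling defaults are hypothetical. It polls the `describe_endpoint` API until the status is `InService`.

```python
import time


def wait_until_in_service(sagemaker_client, endpoint_name,
                          poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Poll the endpoint status until it is InService.

    `sagemaker_client` is expected to behave like boto3.client("sagemaker").
    Raises RuntimeError if the endpoint fails and TimeoutError if it is
    still not ready after `max_polls` checks.
    """
    for _ in range(max_polls):
        description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            reason = description.get("FailureReason", "no reason reported")
            raise RuntimeError(f"Endpoint '{endpoint_name}' failed: {reason}")
        sleep(poll_seconds)
    raise TimeoutError(f"Endpoint '{endpoint_name}' was not InService after {max_polls} checks")
```

Boto3 also provides a built-in waiter for the same purpose, `get_waiter("endpoint_in_service")`, if you prefer not to write the loop yourself.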

After the model is deployed to a SageMaker AI endpoint, you can test it. You can access the model using the following AWS Command Line Interface (AWS CLI) code example. You can use nvidia/nemotron-3-nano as the model ID.

# "stream": false selects non-streaming mode.
# "enable_thinking": false selects non-reasoning mode.
cat > input.json << EOF
{
  "model": "${MODEL_ID}",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is NVIDIA? Answer in 2-3 sentences."
    }
  ],
  "max_tokens": 512,
  "temperature": 0.2,
  "stream": false,
  "chat_template_kwargs": {"enable_thinking": false}
}
EOF

# The last argument is the file that receives the response body.
aws sagemaker-runtime invoke-endpoint \
  --endpoint-name "${ENDPOINT_NAME}" \
  --region "${AWS_REGION}" \
  --content-type 'application/json' \
  --body fileb://input.json \
  response.json
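After the CLI call completes, the model's reply is stored in response.json. The following is a minimal sketch of how you might pull the assistant text out of that file. It assumes the endpoint returns an OpenAI-style chat completion body (`choices[0].message.content`), which this post does not confirm, so inspect the file first and adjust the keys if your response looks different. The helper names are hypothetical.

```python
import json


def extract_reply(response):
    """Return the assistant text from a chat-completion style response dict.

    Assumption: the body follows the OpenAI-style schema with a top-level
    "choices" list. This is not confirmed for this endpoint.
    """
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("No 'choices' in response; the endpoint may use a different schema")
    message = choices[0].get("message", {})
    return message.get("content", "")


def load_reply(path="response.json"):
    """Read the file written by the CLI call and return the assistant text."""
    with open(path, encoding="utf-8") as f:
        return extract_reply(json.load(f))
```

Raising an error on a missing `choices` key, rather than returning an empty string, makes a schema mismatch visible right away.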

Alternatively, you can access the model using the SageMaker SDK and Boto3. The following Python code example shows how to send a text message to NVIDIA Nemotron 3 Nano 30B. For additional code examples, see the NVIDIA GitHub repo.

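import json

import boto3


class NemotronClient:
    """Minimal wrapper around a SageMaker endpoint (the class name is illustrative)."""

    def __init__(self, endpoint_name, region):
        self.endpoint_name = endpoint_name
        self.runtime_client = boto3.client("sagemaker-runtime", region_name=region)

    def invoke(self, prompt):
        """Send a single user message to the endpoint and return the parsed reply."""
        payload = {
            "messages": [
                {"role": "user", "content": prompt}
            ],
            "max_tokens": 1000
        }

        try:
            response = self.runtime_client.invoke_endpoint(
                EndpointName=self.endpoint_name,
                ContentType="application/json",
                Body=json.dumps(payload)
            )

            response_body = response["Body"].read().decode("utf-8")
            raw_response = json.loads(response_body)

            # Parse the response using our custom parser
            # (parse_response is not shown in this excerpt)
            return self.parse_response(raw_response)

        except Exception as e:
            raise RuntimeError(
                f"Failed to invoke endpoint '{self.endpoint_name}': {e}. "
                "Check that the endpoint is InService and that you have "
                "least-privilege IAM permissions assigned."
            ) from e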
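Both examples send the same kind of request body, so it can be convenient to build it in one place. The following is a minimal sketch of such a helper. The function name `build_payload` and its default values are hypothetical; the field names (`messages`, `max_tokens`, `temperature`, `stream`, and `chat_template_kwargs` with `enable_thinking`) are the ones used in the CLI example in this post.

```python
import json


def build_payload(prompt, system_prompt=None, max_tokens=512,
                  temperature=0.2, enable_thinking=False):
    """Build a non-streaming chat request body for the endpoint.

    Set enable_thinking=False for non-reasoning mode, as in the CLI example.
    """
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    return {
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": False,
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }


# json.dumps turns Python's False into JSON's lowercase false.
body = json.dumps(build_payload("What is NVIDIA? Answer in 2-3 sentences.",
                                system_prompt="You are a helpful assistant."))
```

You can pass `body` directly as the `Body` argument of `invoke_endpoint`.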

Now available

NVIDIA Nemotron 3 Nano is now available as a fully managed model in SageMaker JumpStart. Refer to the model package for AWS Region availability. To learn more, check out the Nemotron Nano model page, the NVIDIA GitHub sample notebook for Nemotron 3 Nano 30B, and the Amazon SageMaker JumpStart pricing page.

Try the Nemotron 3 Nano model in Amazon SageMaker JumpStart today and send feedback to AWS re:Post for SageMaker JumpStart or through your usual AWS Support contacts.

About the authors

Dan Ferguson is a Solutions Architect at AWS, based in New York, United States. As a machine learning services specialist, Dan helps customers integrate ML workflows efficiently, effectively, and sustainably.

Pooja Karadagi leads product and strategic partnerships for Amazon SageMaker JumpStart, the machine learning and generative AI hub within SageMaker. Pooja is dedicated to accelerating customer AI adoption by simplifying foundation model discovery and deployment, enabling customers to build production-ready generative AI applications across the entire model lifecycle, from onboarding and optimization to deployment.

Benjamin Crabtree is a senior software engineer on the Amazon SageMaker AI team, specializing in delivering “last mile” experiences to customers. He is passionate about democratizing the latest artificial intelligence breakthroughs by offering easy-to-use capabilities. Ben is also highly experienced in building large-scale machine learning infrastructure.

Timothy Ma is a leading expert in generative AI at AWS, where he collaborates with customers to design and deploy cutting-edge machine learning solutions. He also leads go-to-market strategies for generative AI services, helping organizations harness the potential of advanced AI technologies.

Abdullahi Olaoye is a Senior AI Solutions Architect at NVIDIA, specializing in integrating NVIDIA AI libraries, frameworks, and products with cloud AI services and open-source tools to optimize AI model deployment, inference, and generative AI workflows. He collaborates with AWS to increase AI workload performance and adoption of NVIDIA-powered AI and Generative AI solutions.

Nirmal Kumar Juluru is a product marketing manager at NVIDIA who drives adoption of AI software, models, and APIs in the NVIDIA NGC Catalog and NVIDIA AI Foundation models and endpoints. He previously worked as a software developer. Nirmal holds an MBA from Carnegie Mellon University and a bachelor’s degree in Computer Science from BITS Pilani.

Vivian Chen is a Deep Learning Solutions Architect at NVIDIA, where she helps teams bridge the gap between complex AI research and real-world performance. Specializing in inference optimization and cloud-integrated AI solutions, Vivian focuses on transforming the heavy lifting of machine learning into fast, scalable applications. She is excited to help customers navigate NVIDIA’s accelerated computing stack to ensure that their models not only work in the lab, but also thrive in production.
