AI Inference Systems Engineer NVIDIA Careers

NVIDIA • Toronto, ON, Canada • Posted May 27, 2026

Location Toronto, ON
Job Type Full-time
Category Other-General
Posted May 27, 2026
Join NVIDIA as an AI Inference Systems Engineer to revolutionize model efficiency and scalability. Focus on optimizing GPU performance while collaborating with top-tier teams in AI development.
In this role, you'll draw on your extensive programming background to build high-performance inference frameworks and tools and to lead optimizations in GPU kernels and benchmarking methodologies. Your work is crucial to shaping the future of ML systems and improving deployment across clouds.
Key Responsibilities:
• Enhance vLLM with the latest NVIDIA GPU features
• Benchmark and optimize GPU kernels for peak performance
• Contribute to industry-leading MLPerf Inference submissions
• Architect scheduling for large-scale GPU inference
• Push boundaries of ML research and system integration
Requirements:
• Master’s in CS/CE/SE with 5+ years of experience
• Strong skills in Python, C/C++, and ML frameworks
• Familiarity with CUDA and GPU performance tools
• Experience in c...
