AI Inference Systems Engineer - NVIDIA Careers
Location
Toronto, ON
Job Type
Full-time
Category
Other-General
Posted
May 27, 2026
Join NVIDIA as an AI Inference Systems Engineer to revolutionize model efficiency and scalability. Focus on optimizing GPU performance while collaborating with top-tier teams in AI development.
In this role, you'll draw on your extensive programming background to build high-performance inference frameworks and tools, lead GPU kernel optimizations, and develop benchmarking methodologies. Your work will help shape the future of ML systems and improve deployment across clouds.
Key Responsibilities:
• Enhance vLLM with the latest NVIDIA GPU features
• Benchmark and optimize GPU kernels for peak performance
• Contribute to industry-leading MLPerf Inference submissions
• Architect scheduling for large-scale GPU inference
• Push boundaries of ML research and system integration
Requirements:
• Master’s in CS/CE/SE with 5+ years of experience
• Strong skills in Python, C/C++, and ML frameworks
• Familiar with CUDA and GPU performance tools
• Experience in c...