Senior Engineer for GPU Inference at NVIDIA (Toronto)

NVIDIA • Toronto, ON, Canada • Posted May 29, 2026

Location Toronto, ON
Job Type Full-time
Category Other-General
Posted May 29, 2026

Job Overview


Join NVIDIA as a Senior Engineer and build AI inference systems that serve large-scale models efficiently. You will focus on optimizing GPU performance and collaborate with experts in the field.


In this role, you will architect high-performance inference stacks and optimize NVIDIA's GPU solutions. Your expertise will help achieve industry-leading benchmark results and implement state-of-the-art GPU kernels in a collaborative, multi-cloud environment.


Leverage your skills in performance engineering at NVIDIA to drive AI innovation.


Key Responsibilities



  • Develop and optimize features for vLLM using the latest GPU technology

  • Benchmark and profile GPU kernels for efficiency

  • Create tools for inference benchmarking methodologies

  • Lead orchestration of large-scale inference deployme...
