Senior HPC Site Reliability Engineer
Location: Yokneam, Israel
Job Type: Full-time
Category: other-general
Posted: May 27, 2026
We are now looking for a Senior HPC Site Reliability Engineer to join our mission and continue improving our HPC infrastructure. A meaningful part of NVIDIA’s strength is our unique and advanced development tools and environments that enable our incredible pace of innovation. We are looking for architects to help us evolve the way our private compute cloud is architected and optimized.
What you will be doing:
+ Provide leadership in the design and implementation of our large-scale compute cloud that enables the world's top chip modelers, designers, and deep learning experts to invent groundbreaking technology.
+ Identify architectural changes or entirely new approaches to our cloud architecture and design.
+ Help with strategic challenges we encounter, including: effective resource utilization in a heterogeneous compute environment, evolving our private/public cloud strategy, capacity modeling, and planning for multi-year growth and scaling acr...