Site Reliability Engineer (HPC)

Microsoft Corporation • Multiple Locations, United States • Posted June 27, 2026

Location: Multiple Locations, United States
Job Type: Full-time
Category: other-general

**Overview**

As Microsoft continues to push the boundaries of AI, we are on the lookout for passionate individuals to work with us on the most interesting and challenging AI questions of our time. Our vision is bold and broad — to build systems that have true artificial intelligence across agents, applications, services, and infrastructure. It’s also inclusive: we aim to make AI accessible to all — consumers, businesses, developers — so that everyone can realize its benefits.

We’re looking for an experienced High Performance Computing (HPC) **Site Reliability Engineer (SRE)** to join our HPC infrastructure team. In this role, you’ll blend software engineering and systems engineering to keep our large-scale distributed AI infrastructure efficient and reliable, with very high uptime.

**Microsoft Superintelligence Team**

Microsoft Superintelligence team’s mission is to empower every person and...
