Site Reliability Engineer
About the Job
An opportunity to grow your SRE craft in a fast-paced, collaborative environment on Google Cloud Platform, with exposure to multi‐cloud technologies and modern data engineering.

Reliability & Incident Response
- Monitor production systems using observability tooling (dashboards, alerts, and logs) to detect and triage issues before they impact end users
- Participate in on‐call rotations, respond to incidents following established runbooks, and escalate appropriately when needed
- Contribute to blameless post‐mortems, documenting root causes and follow‐up action items to prevent recurrence
- Help maintain and improve SLO dashboards and alerting thresholds to ensure platform health is visible and measurable

Toil Reduction & Automation
- Identify repetitive manual tasks and build automation to eliminate them, reducing toil for yourself and the broader team
- Write and maintain scripts, tooling, and CI/CD pipeline components that improve deployment reliability and operational eff...