Site Reliability Engineer | Remote

Crossing Hurdles • Ciudad de México, Ciudad de México, Mexico • Posted June 04, 2026

Location Ciudad de México, Ciudad de México

Job Type Full-time

Category Networks and Systems

Posted June 04, 2026

Responsibilities
Design, implement, and maintain scalable infrastructure using Linux and Kubernetes.
Monitor system performance using Prometheus and address potential issues proactively. 
Automate operational processes to improve system reliability and efficiency. 
Respond to incidents, perform root cause analysis, and implement improvements. 
Collaborate with development teams to ensure smooth deployments and high availability. 
Create and maintain documentation, runbooks, and operational guidelines. 
Promote best practices in reliability, security, and system performance. 
Requirements  
Strong experience with Linux system administration and troubleshooting. 
Strong expertise in Kubernetes cluster management and orchestration. 
Strong experience using Prometheus for monitoring and alerting. 
Proficiency in scripting languages such as Bash or Python. 
Strong problem-solving and in...
            
