Site Reliability Engineer | $70/hr Remote

Crossing Hurdles • Remote, South Africa • Posted June 08, 2026

Location Remote
Job Type Full-time
Category Quality Engineering

Responsibilities

  • Deploy, monitor, and recover containerized AI training environments.
  • Troubleshoot infrastructure bottlenecks and resolve system failures in real time.
  • Build and manage resilient systems for stability and performance optimization.
  • Collaborate with engineering teams to improve CI/CD pipelines and automation.
  • Manage filesystem structures, storage, and process scheduling in containerized environments.
  • Replan dynamically at runtime when issues or system failures occur.
  • Document system processes, solutions, and best practices.

Requirements

  • Strong experience with terminal-based system administration and troubleshooting.
  • Expertise in containerized environments such as Docker or Kubernetes.
  • Strong Python skills for scripting, automation, and debugging.
  • Proficiency in Bash and familiarity with additional programming languages.
