Site Reliability Engineer
Location
Hong Kong, Hong Kong SAR
Job Type
Full-time
Category
Computer and Mathematical
Posted
June 16, 2026
Position Overview
We are seeking an experienced Support Analyst to take operational ownership of build and shared services, including monitoring, Site Reliability Engineering (SRE), and the stability and performance of critical systems.
Key Responsibilities
- Monitor and support SRE operations to ensure reliability, availability, and performance of production systems.
- Build, enhance, and maintain monitoring solutions using:
  - ITRS Geneos
  - Prometheus
  - VictoriaMetrics
  - Elasticsearch
  - Grafana
- Design and maintain alerting rules, dashboards, and observability pipelines.
- Troubleshoot and maintain Linux servers (RHEL 7/8/9), including:
  - performing upgrades, configuration changes, patching, and maintenance
  - assessing monitoring needs for system changes
- Perform log analysis and fault finding to identify and resolve performance exceptions.
- Collaborate wi...