Site reliability engineer (Xico)

Link-Worldwide

Valle de Chalco Solidaridad, Estado de México, Mexico | Full-time | June 27, 2026

Job Description

This role is responsible for building, operating, and scaling highly reliable AI/ML and cloud infrastructure platforms. The position combines site reliability engineering (SRE), platform engineering, and AI operations (AIOps) to keep production systems stable, automated, and scalable.

Key Responsibilities

  • Build and scale agentic AI systems for incident triage, anomaly detection, and self-healing automation.
  • Maintain and improve the reliability and performance of AI/ML model-serving infrastructure.
  • Operate, optimize, and scale distributed cloud-native systems.
  • Drive automation initiatives to reduce manual operational work and improve efficiency.
  • Define and manage SLOs, monitoring, observability, and incident response processes.
  • Participate in troubleshooting, root-cause analysis, and continuous system improvement.

Required Skills & Experience

  • 5+ years of experience in SRE, produc...
