Key Responsibilities
As a Site Reliability Engineer within Advanced Analytics (DA3) in the Chief Data & AI Office at Allianz Partners, you will join the platform engineering team to own the reliability and operational health of the central engineering platform. You will define and maintain service level objectives, drive incident response at the infrastructure layer, and systematically eliminate operational toil through automation. You will work closely with Platform Engineers, Security Engineers, and incident‑response leads to ensure the platform meets its reliability commitments across production workloads spanning AI services, Java APIs, and frontend applications.
- Define, instrument, and maintain SLOs and SLIs for platform components; own error budget tracking and produce regular reliability reports for senior leadership.
- Serve on the on‑call rotation as the infrastructure escalation tier; lead incident response for cluster‑level, network‑leve...