Job Description
Make a significant impact at Confluent as an Expert Site Reliability Engineer focused on incident management and reliability enhancements. You'll work within a multi-cloud architecture to optimize performance and reliability.
This expert role is 75% technical engineering and 25% strategy: analyzing systemic failure patterns, designing reliability frameworks, and teaching best practices. You'll be instrumental in developing incident response processes that support organizational success and sustainability. Join a global team dedicated to improving cloud-based reliability.
Key Responsibilities:
• Analyze and improve systemic failure patterns
• Own configuration and workflows for incident management tools
• Define SLO/SLA frameworks to guide reliability investments
• Edit incident documents for customer clarity
• Lead training programs and coach teams through post-mortems
Requirements: • 10+ years of experience in SRE or incident manageme...