EXL
Website:
exlservice.com
Job details:
Role Overview:
We are hiring an experienced SRE leader to manage a global Incident Management team and drive operational excellence for this engagement. The role involves leading teams, handling complex incidents, and improving overall incident response strategy.
Key Responsibilities:
- Lead and mentor a team of SRE engineers (Level 7 ICs)
- Own end-to-end incident management operations across regions
- Establish and drive incident response processes and governance
- Ensure an effective 16x7 delivery model across geographies
- Act as the escalation point for critical incidents and stakeholder communication
- Drive continuous improvements in mean time to mitigate (MTTM) and operational efficiency
- Lead process enhancements, SOP creation, and knowledge transfer planning
Required Skills:
- Strong experience in SRE / Incident Management leadership
- Proven ability to manage high-impact, complex incidents
- Excellent communication, stakeholder management, and leadership skills
- Ability to drive alignment and influence cross-functional teams
Technical Skills:
- Ability to proactively identify risks using monitoring tools such as Datadog and Grafana dashboards
- Experience in incident response, with the ability to quickly restore services (restart, patch, or remediate live issues)
- Strong focus on minimizing service downtime across environments
- Hands-on experience supporting both on-premises (Linux) environments and cloud platforms (primarily Azure, with some exposure to GCP)
- Solid understanding of networking concepts and system architecture