PwC
Website:
pwc.com
Job details:
Instructions
Please update areas marked in red
Link to Tips & Tricks for Writing PwC Job Description
- Quick Tips for Reviewing your JD!
- Make sure you have the appropriate header sentence based on the level of the JD (e.g., a Manager-level role should start with the descriptor “Demonstrates extensive abilities and/or a proven record of success as a team leader:”). The appropriate header can be found in the Tips and Tricks document provided above.
- Be mindful of grammatical consistency. The list should be either all verb-driven or all noun-driven (but not both).
- When listing requirements under the required or preferred skills section, each sentence should end in a semicolon (;) except for the last bullet, which should end with a period (.)
Job Profile Name
Child Name
Global LoS
Global Network
Global Competency Network
Go-To-Market
Managed Services
Sector
Not Applicable
Programme Type
Experienced
Additional Responsibilities: (This field may be used to describe the daily role, duties and/or purpose of this Job Profile/Job Description. The field is limited to 500 characters, including spaces.)
Leads reliability improvements across applications, platforms, and cloud systems. Drives automation, enhances observability, optimizes performance, and conducts root-cause analysis. Partners with engineering teams to reduce toil, improve operational maturity, and strengthen service resilience.
Minimum Degree Required: Bachelor’s
Degree Preferred: Bachelor’s or Master’s degree in Science, Computer Science, or Engineering
Minimum Years of Experience: 5-7 years
Certifications Required: None
Certifications Preferred: AWS Solutions Architect Associate; Azure Administrator; Kubernetes CKA; Terraform Associate; ITIL Foundation. Observability, scripting, and coding certifications are also welcome.
Required / Mandatory Knowledge/Skills: (character count limit 5000)
*PLEASE ONLY USE THIS FIELD IF THIS IS A MUST HAVE SKILL FOR APPLICANT*
- Strong understanding of SRE practices including SLIs/SLOs, error budgets, service health, and operational KPIs;
- Ability to automate operational tasks using Python, Shell, PowerShell, Go, or similar languages;
- Experience improving alerting systems, reducing noise, and refining observability instrumentation;
- Proficiency with cloud platforms and core services (compute, storage, networking, serverless);
- Experience executing root-cause analysis and problem management;
- Ability to lead incident response and coordinate cross-team troubleshooting;
- Experience identifying systemic reliability gaps and proposing engineering solutions;
- Ability to design performance tests, validate reliability risks, and assess scalability;
- Strong communication skills for partnering with development, operations, and leadership.
Preferred Knowledge/Skills: (character count limit 5000)
*PLEASE MAKE THIS A BULLETED LIST WHERE EACH SENTENCE STARTS WITH THE SAME VERB TENSE (E.G. PROVIDES, DEVELOPS, FACILITATES, ETC.)*
- Leads tuning of monitoring rules, dashboards, and reliability metrics;
- Leads development of automation to reduce operational toil and manual interventions;
- Leads incident response actions and service stabilization procedures;
- Leads post-incident reviews and contributes to long-term fixes;
- Leads resilience initiatives such as chaos testing and failover drills;
- Leads capacity forecasting and risk identification;
- Leads refinement of operational standards, documentation, and runbooks;
- Leads collaboration with product and engineering teams to embed reliability requirements.