Site Reliability Engineer II - Remote

Min Experience

2 years

Location

remote

Job Type

full-time

About the job

This job is sourced from a job board.

About the role

Are you passionate about improving business processes? Do you enjoy working with a diverse, multinational team of engineering talent? Join our highly skilled Site Reliability team. Our team designs, develops, and manages the applications and infrastructure that support Akamai's Compute products and services. We specialize in creating solutions that help improve observability and enforce SLAs across all internal teams.

Job Responsibilities

Deploying and maintaining our observability platform and internal tooling
Partnering across teams to ensure the reliability, scalability, and usability of our products and services
Providing guidance to engineers and developers to increase confidence that their services are performing as expected
Collaborating with our support, operations, and engineering teams to investigate and troubleshoot complex problems
Improving monitoring and analysis platforms to ensure rapid error detection and remediation, including developing automated remediation
Participating in on-call rotations, guiding restoration and repair of service-impacting issues

Requirements

Have 2 years of experience and a Bachelor's degree in Computer Science, or equivalent experience
Have experience developing applications and scripts using languages such as Python, Bash, Go, Rust, or similar
Have familiarity with infrastructure-as-code tools such as Terraform or Pulumi
Have proficiency with a configuration management tool such as Ansible, SaltStack, Chef, Puppet, or similar
Have experience with continuous integration / continuous deployment tools such as Jenkins, GitHub Actions, or similar
Have experience with one or more observability tools such as Prometheus, Nagios, Grafana, Loki, ELK, or New Relic

Skills

Ansible
Automated
Computer Science
Configuration Management