Site Reliability Engineer

Salary

$130k - $180k

Min Experience

5 years

Location

remote

Job Type

full-time

About the job

This job is sourced from a job board.

Overview

About the role

The Site Reliability Engineer will be responsible for operating and scaling the core HashiCorp SaaS platforms (Terraform Cloud and Vault Cloud). You will work closely with product, engineering, and security teams to build and run highly available, performant, and secure infrastructure. This role requires deep knowledge of cloud infrastructure, observability, incident response, and DevOps practices. You will participate in an on-call rotation to respond to production incidents and outages.

Responsibilities:

- Design, build, and operate highly available, fault-tolerant, and secure infrastructure for HashiCorp SaaS platforms
- Drive improvements to our service reliability, observability, and incident response practices
- Collaborate with cross-functional teams to ensure SLIs, SLOs, and SLAs are met
- Proactively identify and remediate operational risks
- Automate infrastructure provisioning, configuration, and deployments
- Participate in an on-call rotation to respond to production incidents and outages

Requirements:

- 5+ years of experience as a Site Reliability Engineer or equivalent
- Proficient in infrastructure as code tooling (e.g. Terraform, CloudFormation, Ansible)
- Expertise in cloud infrastructure (AWS, GCP, Azure) and networking
- Hands-on experience with containerization and container orchestration (e.g. Kubernetes)
- Solid understanding of distributed systems, observability, and incident response
- Strong scripting and programming skills (e.g. Python, Go, Bash)
- Excellent troubleshooting and problem-solving skills
- Effective communication and collaboration skills

About the company

HashiCorp helps organizations automate hybrid cloud environments with a unified approach to Infrastructure and Security Lifecycle Management.

Skills

terraform

cloud

kubernetes

python

bash