Our client, a leading player in the crypto/blockchain space, is seeking an experienced Director of Site Reliability Engineering (SRE) to drive the strategy, development, and optimization of their infrastructure. As the SRE Director, you will lead a high-performing team, ensuring system reliability, scalability, and security while collaborating with engineering, security, and product teams.
Key Responsibilities:
- Lead the development and execution of the SRE strategy, focusing on system reliability, performance, and security.
- Manage and mentor a team of SREs, fostering a culture of automation, observability, and continuous improvement.
- Define and manage Service Level Objectives (SLOs), Service Level Agreements (SLAs), and error budgets to maintain a balance between innovation and system stability.
- Architect and manage containerized environments using Kubernetes and cloud-native technologies.
- Oversee Infrastructure as Code (IaC) using Terraform, ensuring compliance and repeatability.
- Build and enhance CI/CD pipelines, improving software delivery and security.
- Lead observability efforts with tools like Datadog, Prometheus, and OpenTelemetry.
- Drive incident response, post-mortem reviews, and improvements to system design and operational procedures.
- Optimize infrastructure for blockchain nodes, validators, and smart contracts, ensuring high availability and security.
Required Qualifications:
- 10+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering.
- Deep expertise in Kubernetes, containers, and cloud-native architectures.
- Strong proficiency with Terraform and other Infrastructure as Code (IaC) tools.
- Extensive experience with AWS and on-prem environments.
- Hands-on experience with observability tools such as Datadog, Prometheus, and Grafana.
- Proven track record of securing and optimizing blockchain infrastructure (e.g., Ethereum, Solana, Bitcoin).
- Experience leading high-performing SRE teams and working cross-functionally with engineering and security teams.
- Strong problem-solving, incident management, and communication skills.
Compensation: $200,000 - $250,000
Salary is based on a range of factors, including relevant experience, knowledge, skills, and other job-related qualifications.