Site Reliability Engineering (SRE)

Location

Mumbai Metropolitan Region

Job Type

Full-time

About the job

This job is sourced from a job board.

About the role

Viraaj HR Solutions Private Limited

Website: viraajhrsolutions.com
Job details:
About The Opportunity

We are a recruitment partner serving technology and cloud infrastructure organizations across India, placing engineering teams into enterprise and scale-up environments. We are hiring an on-site Site Reliability Engineer to join a fast-moving operations and platform team that owns availability, scalability and observability for critical production systems. Location: India (on-site).

Primary Title

Site Reliability Engineer

Role & Responsibilities

  • Operate and optimize production Kubernetes clusters and underlying Linux-based infrastructure to ensure high availability and fault tolerance.
  • Implement Infrastructure as Code (IaC) using Terraform and manage cloud resources on AWS to support secure, repeatable deployments.
  • Build and maintain monitoring, logging and alerting stacks (SLO/SLI-driven): implement Prometheus metrics, dashboarding, and automated alerts with clear runbooks.
  • Design and own CI/CD pipelines to automate build, test, and release workflows; reduce manual deployments and rollout risk.
  • Lead incident response and postmortems: triage outages, identify root causes, implement remediation and reliability improvements.
  • Develop automation tools and scripts to eliminate operational toil, support capacity planning, and collaborate with development teams on performance and reliability engineering.

Skills & Qualifications

Must-Have

  • Kubernetes
  • Linux
  • Terraform
  • AWS
  • Prometheus
  • CI/CD

Preferred

  • Grafana
  • Python
  • ELK Stack

Benefits & Culture Highlights

  • Hands-on ownership of production systems with clear career growth into platform and reliability leadership roles.
  • Collaborative engineering culture that prioritizes automation, observability, and measurable SLAs.
  • On-site role offering close collaboration with cross-functional product and infrastructure teams.

How to apply: Candidates who meet the Must-Have skills and are open to on-site roles in India are encouraged to apply. This role is ideal for engineers who enjoy automating infrastructure, improving service reliability, and driving operational excellence.

Skills: Prometheus, Kubernetes, DevOps, AWS, reliability, Terraform, Linux

Skills

Python
AWS
automation tools
capacity planning
cloud infrastructure
cross-functional
DevOps
incident response
Kubernetes
Terraform