Senior DevOps Engineer - GPT

Salary

₹30 - 40 LPA

Min Experience

8 years

Location

Chennai

Job Type

Full-time

About the role

We are hiring on behalf of a leading client based in Chennai that is seeking an experienced AWS DevOps Engineer with strong Site Reliability Engineering (SRE) skills. This role is ideal for professionals with 8–11 years of experience in managing, automating, and optimizing cloud infrastructure while ensuring reliability, scalability, and efficiency.

Skills Required

DevOps & AWS Expertise

  • Strong experience with AWS services (EC2, S3, Lambda, RDS, CloudFormation, etc.).
  • Expertise in CI/CD pipelines using tools like Jenkins, GitLab CI, or AWS CodePipeline.
  • Experience with Infrastructure as Code (IaC) tools like Terraform or AWS CloudFormation.
  • Proficiency in configuration management tools like Ansible, Chef, or Puppet.
  • Hands-on experience with containerization technologies like Docker and orchestration tools like Kubernetes (EKS preferred).

Site Reliability Engineering (SRE)

  • Strong skills in monitoring, logging, and alerting tools (e.g., CloudWatch, Prometheus, Grafana, ELK stack).
  • Expertise in incident management, root cause analysis, and post-mortem processes.
  • Proven ability to design scalable and fault-tolerant systems.
  • Solid scripting knowledge (Python, Bash, or similar) for automation and operational tasks.
  • Familiarity with SLA/SLO/SLI concepts and their implementation in production systems.

Responsibilities

  • Design, deploy, and manage scalable, secure, and cost-effective AWS infrastructure.
  • Implement and optimize CI/CD pipelines to enable rapid delivery of features.
  • Ensure system reliability and uptime through effective monitoring, logging, and incident response.
  • Collaborate with cross-functional teams to implement infrastructure as code and automation practices.
  • Troubleshoot and resolve complex cloud infrastructure and application issues.
  • Drive performance optimization, cost management, and capacity planning.
  • Contribute to the development of operational best practices and documentation.
  • Proactively improve system availability, performance, and security.

Skills

DevOps
AWS
SRE