SRE, Python/Java Cloud

Salary

₹15 - 25 LPA

Min Experience

5 years

Location

Chennai

Job Type

Full-time

About the job

This job is sourced from a job board.

About the role

Job Description: SRE, Python/Java Cloud

Experience Required: 5-6 years

Position Description

As a Site Reliability Engineer, you will play a pivotal role in elevating the performance and dependability of GDI&A platforms and applications. In this role, you will:

Participate in 24x7 on-call production support rotations and handle incident response to minimize disruptions.
Continuously monitor the availability, reliability, and performance of systems, platforms, and applications, maintaining a holistic view of system health.
Regularly review key site technical metrics such as transaction errors, logging, response times, caching strategies, conversion/bounce rates, and capacity and resource utilization.
Provide primary operational and engineering support for multiple large, distributed software applications.
Proactively identify stability risks and work with engineering leadership to establish appropriate mitigation plans.
Use automation tools, scripts, and processes to reduce or eliminate repetitive tasks, thereby improving the support provided by Site Reliability Engineering.
Create or modify Terraform files according to company formats to develop new monitoring dashboards and alert policies.

Skills Required

Python, Java, C/C++, Ruby, and JavaScript.
J2EE, NoSQL/SQL datastores, Spring Boot, GCP/AWS/Azure, and Docker/K8s.
RESTful APIs and microservices platforms.
Experience with any APM or other monitoring tools such as Dynatrace, New Relic, ELK, Splunk, Prometheus, Sensu, Nagios, Kafka, DataDog, and PagerDuty.
Strong experience working with product and development teams to establish error budgets by identifying the right SLOs (service level objectives), SLIs (service level indicators), and KPIs (key performance indicators), and effectively driving the use of the budget to ensure maximum domain availability/uptime.

Skills

python
java
c/c++
ruby
javascript
j2ee
nosql/sql datastore
spring boot
gcp/aws/azure
docker/k8
restful apis
microservices platform