SRE & DevOps Engineer
algoleap
- Location
- Hyderabad, Telangana, India
- Job type
- Full-time
Required skills
- automation tools
- change management
- compliance
- Datadog
- DevOps
- incident response
- load balancing
- Root Cause Analysis
- SRE
About the role
algoleap
Website:
algoleap.com
Job details:
Job Title: SRE and DevOps Engineer
Job Location: Hyderabad/Gurugram/Noida
Start Date: As soon as possible
Key Responsibilities
· Work closely with development, operations, and product teams to ensure monitoring solutions align with business goals.
· Create and maintain scripts and automation tools to streamline monitoring and alerting processes.
· Produce and maintain clear documentation on monitoring setups, best practices, and troubleshooting procedures.
· Train team members and stakeholders on effective use and management of Datadog tools and features.
· Monitor the performance and availability of software systems, identify and resolve issues, and implement proactive measures to prevent future incidents.
· Design and maintain fault-tolerant architectures using redundancy, load balancing, and automated failover mechanisms to minimize downtime and ensure seamless service availability.
· Develop and implement automation strategies to reduce manual intervention and improve system reliability.
· Optimize system performance through proactive monitoring and tuning.
· Prepare and execute disaster recovery plans to ensure business continuity.
· Work closely with development and operations teams to bridge the gap between them, ensuring smooth deployment and operation of applications.
Incident Management
· Follow the incident management process, ensuring timely resolution and minimizing service disruptions.
· Conduct root cause analysis and implement preventive measures to reduce recurring incidents.
· Develop and maintain incident response procedures and communication protocols.
Change Management
· Manage the change management process, ensuring controlled and efficient implementation of changes.
· Assess the impact of proposed changes and mitigate potential risks.
· Ensure compliance with change management policies and procedures.
Metrics And Reporting
· Generate regular reports and dashboards to provide insights into service performance.
· Use data-driven insights to identify trends and drive continuous improvement.
Transformation And Automation
· Identify opportunities for process automation and implement solutions to improve efficiency.
· Evaluate and implement new monitoring tools.
Key Requirements
· Proven expertise in multiple monitoring tools.
· Minimum of 8 years of experience in monitoring and DevOps.
· Proficiency in scripting, coding, and software development principles.
· Strong understanding of IT operations and system management.
· Strong experience with automation tools and frameworks.
· Excellent troubleshooting and problem-solving skills.
Click on Apply to know more.