About the role
Roles & Responsibilities
● Develop and improve instrumentation for monitoring the health, availability, security, efficiency, and scalability of systems.
● Troubleshoot and fix production outages and performance issues.
● Build automation tools for provisioning and managing our cloud infrastructure using GCP and AWS cloud services.
● Contribute to enhancing and managing our continuous delivery pipeline.
● Automate systems configuration by writing policies and modules for configuration management tools.
● Enhance and maintain our log collection, processing, visualization, and alerting infrastructure.
● Explore and build proofs of concept (POCs) for new technologies that could help the infrastructure perform better.
Skills and Requirements
● 2-4 years of professional experience.
● Must have strong automation/scripting skills (Bash/Python is mandatory, Ruby is a plus).
● Experience supporting a managed cloud services infrastructure (AWS/GCP, preferably GCP).
● Ability to maintain, monitor and optimize production database servers.
● In-depth knowledge of Linux environments (CentOS/Ubuntu).
● Experience designing and implementing highly efficient solutions on public cloud for security, resilience, performance, networking, availability, and blue-green deployments in the context of business applications.
● Experience in capacity planning and design, cost and effort estimation, and infrastructure cost optimization.
● In-depth experience setting up and configuring DevOps tools such as Redash, Consul, Grafana, and ELK.
● Working experience with Rundeck is a plus.
● Experience with continuous integration and deployment tools (Jenkins preferred) and familiarity with configuration management tools (working experience with Ansible preferred).
● Independent ownership attitude and a track record of taking responsibility for problems and pushing through to resolution.
● Strong networking skills to debug and fix network-related outages from both the cloud and server standpoints.
● Working knowledge of Terraform and Kubernetes is a must.