T D Newton & Associates
Website: tdnewton.com
Job details:
Job Description – Consultant DevOps Engineer (13+ Years Experience)
We are seeking an experienced Consultant DevOps Engineer to lead complex engineering initiatives across mission‑critical banking and investment platforms. In this strategic and hands-on role, you will be responsible for designing, modernizing, and driving the DevOps roadmap across high‑availability, low‑latency, and risk‑sensitive environments, including pre‑trade, market data, and real‑time processing systems.
The ideal candidate brings 13+ years of deep DevOps engineering experience and strong expertise in infrastructure automation, CI/CD modernization, production reliability, observability, and platform governance, along with the ability to influence and guide cross‑functional engineering teams. This role requires a strong production‑first mindset, proven technical leadership, and the ability to operate in controlled enterprise environments with high expectations of stability, security, and operational rigor.
Key Responsibilities
- Design, implement, and scale end‑to‑end DevOps frameworks, including CI/CD modernization, automated infrastructure provisioning, and operational tooling across business‑critical systems.
- Drive infrastructure automation initiatives to improve reliability, consistency, and resilience across distributed and latency-sensitive platforms.
- Architect monitoring, alerting, and observability solutions for highly available, production‑grade environments, leveraging Python and industry-leading monitoring stacks.
- Define best practices, governance, and engineering standards for DevOps, automation, and operational excellence across global engineering teams.
- Act as a technical consultant to development, infrastructure, platform engineering, and production support teams, guiding them on automation and operational improvements.
- Partner with security, network, and infrastructure teams to ensure compliance with enterprise standards, risk controls, and regulatory requirements.
- Lead incident reviews, stabilize production platforms, and drive root cause analysis with a focus on long-term remediation and operational maturity.
- Oversee production readiness, deployment automation, environment consistency, and configuration management across Unix/Linux ecosystems.
- Manage obsolescence remediation and vulnerability management across infrastructure and application environments in line with enterprise risk and security standards.
- Coordinate and optimize server provisioning, decommissioning, patching cycles, and environment setup activities in line with enterprise hygiene standards.
- Troubleshoot and resolve complex production issues across applications, services, scripts, batch jobs, and market data scripts/pipelines.
- Support and validate firewall and network flow requirements, ensuring secure and compliant connectivity across source‑to‑destination systems.
- Implement operational controls, audit traceability, and deployment discipline required for financial infrastructures.
- Lead disaster recovery (DR) design validation, failover readiness, and operational resilience exercises.
Required Qualifications
- Bachelor’s degree (BE / B.Tech.) in Computer Science, Information Technology, or a related engineering discipline.
- 8+ years of hands-on Python programming experience, including tooling, maintenance, and debugging of Python scripts and applications.
- 13+ years of experience in DevOps engineering, infrastructure automation, production activities, and CI/CD in enterprise environments.
- Strong hands-on experience in:
  - Unix/Linux systems – Unix commands, shell scripting, system utilities, server-level troubleshooting, and administration.
  - Containerization & Orchestration – Docker, Kubernetes.
  - CI/CD Platforms – GitHub Actions, Jenkins, or equivalent.
  - Observability & Monitoring – Prometheus, Grafana, Kibana, Elasticsearch, or similar.
- Proven experience with incident management, production issue analysis, and production reliability engineering.
- Strong understanding of production operations in a controlled enterprise environment, including release discipline, operational governance, and platform stability.
- Excellent communication, leadership, and stakeholder‑management skills, with the ability to influence cross‑functional teams across multiple geographies.
- Demonstrated ownership, accountability, and a strong production‑first mindset.
Good to Have
- Experience in banking, trading, investment platforms, or market data systems.
- Hands-on exposure to Infrastructure as Code (IaC) – Terraform, Ansible, or similar tools.
- Experience with weekend infrastructure checks, disaster recovery (DR) exercises, failover readiness, and platform resilience activities.
- Knowledge of Java-based enterprise applications and the ability to support Java workloads at the DevOps level.
- Experience with cloud platforms (Azure, AWS, GCP) and hybrid cloud automation.