Specialist - Cloud & Infra Management

LTM

Location: Bengaluru, Karnataka, India
Job type: Full-time

Required skills

Python
SIEM
API
CLI
GCP
GitLab
Google Cloud
Linux
Node
proxy
reverse proxy
Vault
VMware
VPC

About the role

Website: ltm.com
Job details:

Role description

Job Title: Infrastructure Operations & Observability Engineer

The Role: We are looking for a proactive Engineer to manage our hybrid global infrastructure (GCP, VMware, Proxmox) and, crucially, to build and maintain our observability layer. You will ensure that every VM, container, and Vault instance is not just running, but is visible, metered, and alerting correctly.

You will be the person who ensures that when a service in Tokyo lags, the team in London sees it on a dashboard before a customer reports it.

Expanded Responsibilities: The Observability Mission

Monitoring Setup: Deploy and maintain agents like Prometheus Node Exporter, Google Cloud Ops Agent, or Telegraf across Linux VMs in GCP and Proxmox.

The Integration Glue: Use Python to sync metadata between GCP/Proxmox and our observability tools (Uptime.com, PagerDuty, and Grafana/GCP Monitoring).

Dashboard Crafting: Create clear, actionable dashboards that show the health of applications such as HashiCorp Vault/OpenBao, system resources, and application uptime.

Alert Hygiene: Tune alerting thresholds to ensure PagerDuty only fires for real issues, reducing noise for the global team.

Log Aggregation: Ensure logs from global servers are flowing correctly into a central location, like GCP Cloud Logging or the corporate SIEM solution, for troubleshooting.

Proxy Management: Deploy and tune Nginx as both a reverse proxy (handling incoming traffic to Vault/OpenBao) and a forward proxy (controlling egress from our private nodes).

Infrastructure as Code: Fluent in the use of IaC technologies and GitLab repositories to drive change and operations.

Web Server Hardening: Manage SSL/TLS certificates (Let's Encrypt/ACME) and ensure proxy headers are configured for security and performance.

Technical Skills Checklist

1. Observability & Operational Monitoring (The Priority)

Synthetic Monitoring: Hands-on experience with creating tools and scripts for global health checks.

Incident Management: Proficiency in PagerDuty (setting up services, integrations, and alert grouping).

Metrics & Visualisation: Familiarity with Prometheus/Grafana or GCP Monitoring (Stackdriver).

Understanding the difference between a Metric (how much) and a Log (what happened).

Health Checks: Ability to write custom health-check endpoints or scripts to verify service integrity.

2. Hybrid Infrastructure (GCP & Proxmox)

Proxmox VE: Managing VM/LXC lifecycles, snapshots, and basic cluster health.

GCP: Compute Engine, GKE, and VPC networking.

Linux: Advanced CLI skills for performance debugging (htop, iostat, netstat, journalctl).

3. Automation & Security

Python: Essential for observability-as-code, that is, writing scripts to automate the creation of monitors or alerts via API.

Vault/OpenBao: Maintaining the Observer role, ensuring the monitoring tools have the correct limited permissions to check Vault health.
