Senior Cloud Operations Engineer - AWS

Min Experience

8 years

Location

Bengaluru

Job Type

Full-Time

About the job

This job is sourced from a job board.

About the role

Role Overview

We seek a top-tier Senior Cloud Operations Engineer - AWS to lead strategic cloud operations, drive innovation, and shape product direction. Only candidates with proven expertise in large-scale, high-spend AWS environments will be considered.

Key Responsibilities

Cloud Operations & Architecture:
Design, secure, and optimize cloud infrastructure handling $1M+/year spend across 100+ AWS accounts.
Automate operations using Python, CloudFormation, and Systems Manager to enable proactive scaling and cost-triggered remediation.
Solve complex failures in distributed systems using CloudWatch, X-Ray, and native tooling.
Implement measurable cost optimizations (e.g., "Achieved 35% savings via rightsizing/Spot adoption").
Enforce least-privilege security, encryption, and compliance via automated audits.
Maintain >99.9% uptime through rigorous incident management and chaos engineering.

Research & Product Leadership:
Lead research initiatives and POCs to explore new technologies and methodologies.
Provide opinionated insights and recommendations to influence product strategy and roadmap.
Collaborate with cross-functional teams to translate research findings into actionable product features.

Generative AI & Emerging Technologies:
Design and implement generative AI applications using Amazon Bedrock, leveraging foundation models from providers like Anthropic and AI21 Labs.
Develop and manage agentic AI workflows using Amazon Bedrock Agents, including custom orchestrators for complex task automation.
Integrate Model Context Protocol (MCP) servers to enhance AI capabilities with domain-specific knowledge and tool access.
Stay abreast of industry trends and emerging technologies to inform product development.

Desired Skills

Must have:
Must have worked at an AWS-certified MSP or Enterprise Cloud Center of Excellence (CCOE) serving multiple internal teams.
Must be hands-on in supporting production-grade workloads across 100+ AWS accounts with >$100K/month cloud spend.
Must have performed 5+ customer assessments such as Formal Technical Reviews (FTRs) or Well-Architected Framework Reviews (WAFRs) with documented optimization results.
Must have built proactive and reactive automations for cost, security, and compliance.
Must have used AWS-native FinOps/SecOps tools (Security Hub, Config, Cost Explorer) or third-party equivalents.
Proven experience in AWS cost optimization and financial operations (FinOps).
Strong proficiency in AWS services such as EC2, S3, RDS, Lambda, and Bedrock.
Experience with AWS Cost Explorer, Budgets, and other cost management tools.
Proficiency in scripting languages like Python or Bash for automation.
Strong analytical and problem-solving skills.
Excellent communication and collaboration abilities.
AWS certifications such as AWS Certified Solutions Architect or AWS Certified AI Practitioner are preferred.

Preferred Skills:
Experience with FinOps tools like CloudHealth, Apptio Cloudability, or similar.
Knowledge of cloud governance and compliance standards.
Familiarity with DevOps practices and CI/CD pipelines.
Experience in leading research initiatives and providing strategic product insights.

Experience

8+ years of experience in cloud operations, with a strong focus on AWS services.
Hands-on experience designing and operating production-grade infrastructure on AWS.
Practical experience writing and managing AWS CloudFormation templates.
Hands-on experience using Amazon CloudWatch for monitoring, alerting, and dashboard configuration.
Performed independent root cause analysis and troubleshooting in cloud production environments.
Managed EC2 lifecycle, patching, and configuration using AWS Systems Manager in production setups.
Ensured cloud security using IAM policies, encryption standards, and audit mechanisms in line with best practices.
Worked directly with product or customer teams to translate business needs into scalable technical solutions.
Led operational readiness and incident management in critical environments, with proven examples.
Experience in designing or deploying generative AI applications using Amazon Bedrock (preferred).
Experience working with Model Context Protocol (MCP) servers or similar AI orchestration frameworks (preferred).

Education

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Skills

python
cloudformation
aws-systems-manager
cloudwatch
x-ray
iam
aws-security-hub
aws-config
aws-cost-explorer
bash
aws-ec2
aws-s3
aws-rds
aws-lambda
aws-bedrock