Cloud Engineer

Network Science

Location: Mumbai Metropolitan Region
Job type: Full-time

Required skills

Python
AWS
Azure
backend
Bash
CDN
cloud infrastructure
CloudFormation
CloudFront
CloudWatch
cross-functional
Datadog
DNS
EC2
ECS
enterprise SaaS
GCP
Git
Kubernetes
Lambda
Linux
load balancing
multi-tenant
Root Cause Analysis
SaaS
Shell Scripting
Terraform
VPC

About the role

Website: networkscience.ai
Job details:

Location: Onsite – Mumbai

Reports to: Head of Infrastructure / CTO

You must be available to join immediately.

Network Science is a global AI innovation platform powering enterprise AI transformation through a metric-backward approach. With 150+ AI projects delivered and a curated ecosystem of 70+ deep-tech startups, we help enterprises move from AI ambition to measurable business outcomes. As we scale our core technology platform, we are looking for a Cloud Engineer who thrives in complex, high-availability environments and takes pride in keeping systems reliable, secure, and performant.

What You Will Own

Cloud Infrastructure & Operations:

• Own, manage, and optimize AWS cloud infrastructure across development, staging, and production environments.

• Monitor system health, respond to incidents, and drive root cause analysis to prevent recurrence.

• Ensure high availability, fault tolerance, and disaster recovery across all cloud-hosted services.

• Make infrastructure decisions that balance cost, performance, and reliability.

AI & Platform Support:

• Support backend and AI/ML workloads running on AWS — including inference endpoints, data pipelines, and model-serving infrastructure.

• Collaborate with engineers to understand customer requirements and design tailored cloud solutions.

• Build and maintain infrastructure for training and inference clusters, working with large-scale models such as LLMs.

Security & Compliance:

• Implement and enforce AWS security best practices — IAM policies, VPC design, encryption, and access controls.

• Conduct regular audits and vulnerability assessments, and ensure compliance with enterprise security standards.

• Apply the principle of least privilege across all cloud services and environments.

Automation & DevOps:

• Automate infrastructure provisioning and configuration using IaC tools (Terraform, CloudFormation, or CDK).

• Build and maintain CI/CD pipelines to streamline deployment and reduce manual intervention.

• Develop runbooks, alerting rules, and self-healing mechanisms to minimize operational toil.

Collaboration & Ownership:

• Work closely with product, backend, AI/ML, and DevOps teams — no ticket-passing culture.

• Translate infrastructure requirements into clear technical designs and implementation plans.

• Take responsibility for systems in production — build it, ship it, own it.

• Share knowledge with peers, help debug cross-functional issues, and improve team workflows.

• You don't wait for instructions when something is broken — you investigate, communicate, and fix.

What We Expect You to Be Good At

Core Skills (Non-Negotiable):

• 3–4 years of hands-on experience with AWS, including:

    • EC2, ECS/EKS, Lambda, S3, RDS, CloudFront

    • VPC, IAM, Route 53, CloudWatch, AWS Config

    • Cost management, reserved instances, and resource optimization

• Strong understanding of networking fundamentals — DNS, load balancing, firewalls, and CDN.

• Experience with Linux systems administration and shell scripting.

• Proficiency in at least one scripting/automation language (Python, Bash, or similar).

• Comfortable working with Git, code reviews, and collaborative engineering workflows.

Cloud & AI Application Focus:

• Experience supporting AI/ML workloads on AWS (SageMaker, Bedrock, or equivalent).

• Understanding of how AI models are deployed and served at scale — latency, throughput, and fallback strategies.

• Ability to design and maintain infrastructure that supports high-throughput, low-latency AI-powered services.

Engineering Mindset:

• You think in systems, not just tickets.

• You ask, "Will this hold under load?" before you ask, "Is it running?"

• You care about reliability, observability, and maintainability as much as resolution speed.

Must-Have Qualifications:

• 3–4 years of experience working as a Cloud or Infrastructure Engineer with AWS as the primary cloud platform.

• AWS certification preferred (Solutions Architect Associate or above).

• Experience working in fast-paced, high-growth environments.

• High empathy with high performance — you care about quality AND outcomes.

• Deep ownership mindset: you love fixing problems before they are noticed.

• Comfortable collaborating with AI/ML and backend teams, understanding their constraints, and translating them into reliable infrastructure.

Good to Have (Signals of Maturity):

• Experience with multi-cloud environments (GCP or Azure alongside AWS).

• Familiarity with Kubernetes (EKS) and container orchestration at scale.

• Exposure to MLOps concepts — model versioning, A/B testing, canary releases for AI features.

• Experience with logging, monitoring, and alerting stacks (ELK, Prometheus, Grafana, Datadog).

• Experience working on multi-tenant platforms or enterprise SaaS products.

How We Measure Success:

• Cloud infrastructure is reliable, observable, and scales without constant firefighting.

• Incidents decrease over time rather than increase, especially around AI-integrated workloads.

• Deployments are automated, repeatable, and trusted by engineering teams.

• Other teams rely on the infrastructure you own as a stable foundation to build on.

What You Won't Find Here:

• Micromanagement disguised as process

• Endless meetings without decisions

• "Just patch it" thinking that creates long-term mess

We value clarity, accountability, and strong engineering judgment.

What We Offer:

• Opportunity to work on impactful, real-world AI products and platforms.

• High ownership and autonomy.

• Fast learning and growth environment, working closely with AI/ML experts.

• Competitive compensation based on experience.
