Senior / Lead Agentic AI & Data Science Engineer

Epitria Consulting

Location: Bengaluru, Karnataka, India
Job type: Full-time

Required skills

AWS
Azure
backend
CloudFormation
compliance
data science
DevOps
Docker
end-to-end
frontend
GCP
GitHub
Helm
Jenkins
Kubeflow
Kubernetes
machine learning
NLP
predictive analytics
product lifecycle
SaaS
Terraform
Vertex

About the role

Website: epitria.com
Job details:

We are looking for a Senior Agentic AI & Data Science Engineer with a deep product engineering background to architect, develop, deploy, and operate production-grade AI systems.

The role requires end-to-end ownership of AI products, covering agent design, ML modeling, system architecture, MLOps, multi-cloud deployment, security, and scalability. The ideal candidate combines strong AI research intuition with real-world engineering excellence.

7–10 years total experience in Data Science, AI/ML Engineering, and Product Engineering

Strong hands-on experience in building, deploying, and scaling Agentic AI systems in production

Location: Bengaluru

Work timings: 12 PM to 9 PM IST

Salary Range: INR 40-41 LPA

Core Responsibilities

Agentic AI & LLM Systems

Design, implement, and optimize Agentic AI architectures involving planning, reasoning, memory, tool-use, and orchestration.
Build and manage multi-agent systems for complex workflows, automation, and decision intelligence.
Implement Retrieval-Augmented Generation (RAG) pipelines with structured and unstructured data sources.
Integrate AI agents with enterprise APIs, databases, SaaS platforms, and internal tools.
Develop robust prompt strategies, agent workflows, fallback mechanisms, and evaluation pipelines.
Deploy and operate LLM-based systems with cost, latency, reliability, and safety considerations.

Data Science & Machine Learning

Build, train, evaluate, and deploy ML/DL models across NLP, structured data, time-series, recommendation, and predictive analytics.
Perform data exploration, feature engineering, statistical analysis, and hypothesis testing.
Design scalable training pipelines, experiment tracking, and model versioning.
Monitor model performance, drift, bias, and data quality in production environments.
Apply explainability and interpretability techniques where required.

Product Engineering & System Design

Own the full AI product lifecycle: problem definition → design → development → deployment → monitoring → iteration.
Translate business and product requirements into scalable, modular, and maintainable AI solutions.
Design distributed, fault-tolerant, and extensible architectures for AI platforms.
Collaborate closely with product managers, UX, backend, frontend, and platform teams.
Enforce engineering best practices including code quality, testing, documentation, and performance optimization.

Multi-Cloud & Infrastructure Engineering

Design, deploy, and operate AI systems across AWS, Azure, and GCP (multi-cloud or hybrid).
Use Docker, Kubernetes, Helm, and cloud-native services for scalable deployments.
Implement Infrastructure as Code (IaC) using Terraform / CloudFormation.
Leverage managed AI/ML services where appropriate (SageMaker, Vertex AI, Azure ML).
Optimize cloud resource utilization and cost across environments.

Security, Governance & Reliability

Ensure data security, privacy, and compliance across AI systems.
Implement secure access control, secrets management, and encrypted data pipelines.
Apply Responsible AI practices: bias detection, fairness, explainability, auditability.
Design systems for high availability, disaster recovery, and fault tolerance.
Establish governance standards for models, data, and AI agents.

Technical Leadership & Collaboration

Provide technical guidance and mentorship to junior engineers and data scientists.
Lead architecture discussions, technical reviews, and best-practice adoption.
Drive innovation in AI/Agentic systems aligned with product and business goals.
Communicate complex technical concepts clearly to both technical and non-technical stakeholders.

Cloud, DevOps & MLOps

Strong hands-on experience with AWS, Azure, and/or GCP (at least two preferred)
Docker, Kubernetes, Helm
CI/CD: GitHub Actions, GitLab CI, Jenkins
MLOps tools: MLflow, Kubeflow, cloud-native ML platforms
Monitoring and observability tools

Architecture & Distributed Systems

Distributed systems and event-driven architectures
Asynchronous processing and workflow orchestration
Scalability, reliability, and performance engineering

