EXL
Website: exlservice.com
Job details:
About the Role
We are looking for an experienced LLM Ops Engineer to own the end-to-end lifecycle of LLM applications in production, from model selection and pipeline design through fine-tuning, deployment, observability, and continuous improvement. This role sits at the intersection of ML Engineering, DevOps, and Data Engineering, and is critical to ensuring that GenAI systems are reliable, cost-efficient, and scalable in enterprise environments. You will partner closely with AI Research, Product, Platform, and Data Engineering teams.
Responsibilities
- Design, build, and maintain end-to-end LLM pipelines - from data ingestion and pre-processing through model training, fine-tuning, and deployment into production.
- Implement and manage CI/CD pipelines for ML/LLM workflows using tools such as MLflow, Kubeflow, and GitHub Actions, ensuring reproducibility and fast iteration cycles.
- Own model lifecycle management: versioning, A/B testing, canary deployments, rollbacks, and governance - ensuring models are always production-safe.
- Architect and operate LLM serving infrastructure on cloud or on-premises with high availability, low latency, and cost efficiency.
- Build robust monitoring, observability, and alerting frameworks for model drift, hallucinations, latency, token costs, and quality regressions, using tools such as LangSmith and Weights & Biases.
- Work with RAG pipelines backed by vector databases, and drive model fine-tuning initiatives for domain-specific applications.
- Establish and enforce LLMOps best practices including prompt versioning, evaluation frameworks, guardrails, PII policies, and audit trails.
- Manage the AI gateway and model routing across multiple LLM providers (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, Vertex AI) with unified auth, rate limiting, and fallback logic.
- Optimise inference costs through quantisation, batching strategies, hardware (GPU/TPU) optimisation, and model compression.
- Mentor junior engineers and contribute to internal documentation and platform tooling.
Qualifications
- B.Tech / M.Tech in CS, AI/ML, Mathematics, or an equivalent field.
Required Skills
- Languages: Python (advanced)
- Frameworks: LangChain, LangGraph, Hugging Face, PyTorch, TensorFlow
- MLOps / Pipeline Tools: MLflow, Kubeflow, Apache Airflow, Prefect
- DevOps / Infra: Docker, Kubernetes, GitHub Actions
- Cloud Platforms: AWS Bedrock, Azure OpenAI, Google Vertex AI
Preferred Skills
- Experience with RAG and vector databases, fine-tuning (LoRA, PEFT), LLM observability (LangSmith, Weights & Biases, or similar), and prompt evaluation.
- Good to have: security governance (LLM red-teaming, PII redaction, AI safety guardrails) and streaming (event-driven architecture).
Pay range and compensation package
- 6 to 10+ years of overall experience in software / ML engineering
- 3+ years of hands-on experience with the production LLM/ML lifecycle
Equal Opportunity Statement
We are committed to diversity and inclusivity.