Website:
ltm.com
Job details:
📋 Job Overview
We are seeking a Python + MLOps Engineer with strong Python development skills and deep expertise in MLOps practices. The ideal candidate will be responsible for building, automating, and maintaining end-to-end ML pipelines — from model training and versioning to deployment, monitoring, and scaling in production environments.
📌 Note
Applicants should have hands-on experience with GenAI and Python, along with knowledge of Large Language Models (LLMs), RAG pipelines, embeddings, and prompt engineering. Practical experience building AI-driven and GenAI applications is preferred.
🔧 Key Responsibilities
- Design, build, and maintain end-to-end ML pipelines for model training, evaluation, deployment, and monitoring.
- Implement CI/CD pipelines for ML models, ensuring automated testing, validation, and deployment.
- Develop and manage containerized ML workloads using Docker and Kubernetes.
- Set up and maintain model registries, experiment tracking, and versioning using MLflow, Weights & Biases, or similar tools.
- Automate data ingestion, feature engineering, and model retraining workflows.
- Monitor model performance in production, implement drift detection, and trigger retraining pipelines as needed.
- Collaborate with data scientists and ML engineers to operationalize models efficiently and at scale.
- Ensure infrastructure scalability, reliability, and cost optimization for ML workloads on cloud platforms.
✅ Must-Have Skills
- Strong proficiency in Python with experience in building production-grade ML applications and automation scripts.
- Hands-on experience with MLOps tools — MLflow, Kubeflow, Airflow, or similar.
- Experience with containerization (Docker) and orchestration (Kubernetes) for ML workloads.
- Proficiency with CI/CD tools (Jenkins, GitHub Actions, Azure DevOps) for ML model deployment.
- Working knowledge of cloud platforms (Azure, AWS, or GCP) and their ML/AI services.
- Experience with model monitoring, logging, and drift detection in production environments.
- Familiarity with Infrastructure as Code (Terraform, CloudFormation) and configuration management.
🌟 Good-to-Have Skills
- Exposure to deploying and serving GenAI models and Large Language Models (LLMs) with tools such as vLLM, TGI, or Triton.
- Knowledge of RAG pipelines, embeddings, and prompt engineering.
- Experience with data versioning tools (DVC, lakeFS).
- Familiarity with distributed training and GPU optimization for large-scale models.