Senior MLOps Engineer
Location: On-site (Gurgaon)
Job Type: Full-Time
eGenome.ai Company Profile:
eGenome.ai is an AI-first health-tech platform that aspires to provide the only health plan you will ever need: it tracks your health in real time, identifies your vulnerabilities to critical illnesses so interventions come in time, helps you manage your weight and retain or build muscle mass, and delivers personalized diet and lifestyle interventions from a nutritionist that actually work for you. eGenome.ai’s offering also includes access to a 24x7 AI Doctor for consultation on any disorder or for a second opinion.
The Company also offers a dedicated range of high-quality snacks and supplements under the B’spoke brand to support the management of critical conditions such as diabetes, fatty liver, obesity, and poor gut health. The company was founded by serial entrepreneurs with a track record of creating trailblazers, including India’s first electric truck (IPLTech Rhino) and a one-of-its-kind logistics platform for heavy trucks and bulkers (Infraprime Logistics).
They have cracked hard-to-enter markets with extraordinary perseverance, clarity of purpose, and creativity. The Company is well capitalized and is expected to be a leader in several categories within the next two years.
About the Role
We are looking for a Senior MLOps Engineer to lead and optimize our machine learning model deployment pipeline. You will be responsible for deploying fine-tuned Hugging Face models on AWS SageMaker, managing cloud infrastructure, improving model inference performance, and ensuring smooth integration into our FastAPI-based inference service. If you have expertise in AWS, Docker, SageMaker, CI/CD, and MLOps best practices, this role is perfect for you!
Responsibilities
- Deploy and manage Hugging Face models on AWS SageMaker for low-latency inference (a deployment sketch follows this list).
- Optimize inference performance, ensuring minimal response times and cost-efficient scaling.
- Develop and maintain MLOps pipelines for continuous integration and deployment (CI/CD) using GitHub Actions.
- Automate infrastructure provisioning with Terraform or CloudFormation.
- Monitor and troubleshoot model performance using AWS CloudWatch and SageMaker logs.
- Secure and optimize AWS resources for cost efficiency and scalability.
- Integrate models with FastAPI-based applications and manage Docker-containerized deployments.
- Ensure model versioning and reproducibility using MLflow or similar tools.
- Collaborate with ML engineers, backend developers, and DevOps teams to streamline deployment workflows.
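To give a flavor of the first responsibility above, here is a minimal sketch of deploying a fine-tuned Hugging Face model to a SageMaker real-time endpoint with the SageMaker Python SDK. The S3 path, IAM role ARN, instance type, and container versions are placeholders, not details of our actual stack.

```python
# Minimal sketch: deploy a fine-tuned Hugging Face model to a SageMaker
# real-time endpoint. All identifiers below are placeholders.
from sagemaker.huggingface import HuggingFaceModel

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder ARN

# model.tar.gz holds the fine-tuned weights (and optionally an inference.py
# entry point), packaged per SageMaker's model-artifact conventions.
model = HuggingFaceModel(
    model_data="s3://example-bucket/models/model.tar.gz",  # placeholder S3 path
    role=role,
    transformers_version="4.37",  # versions must match a supported DLC combination
    pytorch_version="2.1",
    py_version="py310",
)

# Deploy to a single GPU instance; autoscaling policies would be attached
# separately via Application Auto Scaling.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.xlarge",
)

print(predictor.predict({"inputs": "Patient reports fatigue and high fasting glucose."}))
```

In practice a step like this would run from a CI/CD job (e.g., GitHub Actions) rather than ad hoc, with the endpoint name and versions pinned in configuration.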
Requirements
Must-Have:
- Strong expertise in cloud-based MLOps and DevOps, encompassing the full model deployment lifecycle, infrastructure automation, and system scalability.
- Proficient in managing cloud services for machine learning workflows, including model training, deployment, monitoring, and optimization.
- Skilled in implementing scalable, resilient architectures using cloud computing, containerization, CI/CD pipelines, infrastructure as code (IaC), and serverless technologies to ensure seamless integration and operational efficiency.
- Proficiency in Docker, FastAPI, and Kubernetes for serving machine learning models.
- Proficiency in Spring Boot for microservices.
- Experience with CI/CD pipelines (GitHub Actions, GitLab CI, or AWS CodePipeline).
- Hands-on experience with boto3, the SageMaker SDK, and the AWS CLI.
- Strong Python skills, with experience in FastAPI, Transformers, and Hugging Face APIs.
- Understanding of optimizing large models using techniques such as quantization (e.g., via bitsandbytes) or model parallelism (a quantized-serving sketch follows this list).
- Familiarity with LangChain for LLM orchestration.
- Knowledge of security best practices for cloud-based ML deployments.
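To illustrate the quantization and FastAPI items above, the sketch below loads a causal LM in 4-bit precision with bitsandbytes and exposes a single /generate endpoint. The model ID and generation settings are illustrative assumptions, not our production configuration; a CUDA-capable GPU is assumed.

```python
# Minimal sketch: serve a 4-bit-quantized Hugging Face model behind FastAPI.
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder model choice

# 4-bit loading via bitsandbytes cuts GPU memory roughly 4x versus fp16,
# trading some accuracy and latency; it requires a CUDA-capable GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",  # place layers across available GPUs automatically
)

app = FastAPI()

class Prompt(BaseModel):
    text: str
    max_new_tokens: int = 128

@app.post("/generate")
def generate(req: Prompt) -> dict:
    inputs = tokenizer(req.text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=req.max_new_tokens)
    return {"completion": tokenizer.decode(output[0], skip_special_tokens=True)}
```

Saved as app.py, this runs with `uvicorn app:app`; wrapping it in a Docker image is the natural next step for the containerized deployments mentioned above.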
Nice-to-Have:
- Experience with Terraform or CloudFormation for infrastructure automation.
- Knowledge of vector databases (like Pinecone, FAISS, or ChromaDB) for efficient retrieval (a retrieval sketch follows this list).
- Familiarity with monitoring tools like Prometheus and Grafana.
- Exposure to cost optimization strategies for AWS-based model serving.
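As a taste of the retrieval work mentioned above, this sketch builds an in-memory FAISS index over sentence embeddings and runs a cosine-similarity search. The embedding model and documents are placeholder assumptions.

```python
# Minimal sketch: in-memory similarity search with FAISS.
import faiss
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

docs = [
    "Fasting glucose above 126 mg/dL suggests diabetes.",
    "Fatty liver is often reversible with sustained weight loss.",
    "Fiber-rich snacks support gut health.",
]
# Normalized embeddings make inner product equal to cosine similarity.
embeddings = encoder.encode(docs, normalize_embeddings=True)

index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)

query = encoder.encode(["snacks for blood sugar control"], normalize_embeddings=True)
scores, ids = index.search(query, 2)  # top-2 nearest documents
for score, i in zip(scores[0], ids[0]):
    print(f"{score:.3f}  {docs[i]}")
```

A managed vector database (e.g., Pinecone) or an embedded store (e.g., ChromaDB) would take the place of the FAISS index in a production retrieval pipeline.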