Machine Learning Engineer

Antal International

Location: Kolkata, West Bengal, India
Job type: Full-time

Required skills

Python
Backbone
compliance
Docker
embedded systems
end-to-end
Falcon
Linux
machine learning
TensorFlow
PyTorch
ONNX

About the role

Website: antal.com
Job details:

Summary role description:

Hiring a Machine Learning (ML) Engineer for one of the leaders in embedded manufacturing.

Company description:

Our client is a high-end technology company whose products form part of the backbone of enterprise IT infrastructure. The company is headquartered in the US and has a global footprint. Its technology is integrated into millions of devices worldwide, including servers, desktops, and embedded systems. With decades of experience in low-level systems development, the company plays a critical role in shaping the foundational software that powers modern computing platforms.

Role details:

• Title / Designation: Machine Learning (ML) Engineer

• Location: Kolkata

• Experience: 7+ years

Role & responsibilities:

• Build, train, and fine-tune LLMs using techniques like SFT, LoRA/QLoRA, and RLHF, and evaluate their performance.

• Optimize models for local or edge environments by improving speed and efficiency through quantization, pruning, and distillation.

• Deploy models into production on-premises or at the edge using frameworks such as PyTorch, ONNX, TensorRT, vLLM, or llama.cpp, and integrate them into applications via APIs and internal services.

• Design and maintain scalable training and inference pipelines to ensure reproducibility and efficiency.

• Monitor model performance in production, including accuracy, drift, latency, and resource utilization, and continuously optimize outcomes.

• Ensure models meet security, privacy, and compliance requirements, especially in restricted or offline environments.

• Collaborate with software engineers, infrastructure teams, and domain experts to deliver end-to-end AI solutions.

• Document model architectures, training processes, and deployment workflows for clarity and future use.

Candidate requirements:

• Master’s or Ph.D. in Computer Science, Engineering, or a related field, or equivalent practical experience, along with 7+ years of experience in AI/ML, including at least 2 years working on LLMs, large-scale neural networks, RAG, or AI-driven automation.

• Strong hands-on experience with LLMs such as LLaMA, Mistral, Falcon, or similar open-weight models, along with proficiency in Python and frameworks like PyTorch or TensorFlow.

• Expertise in vector databases and retrieval systems (FAISS, Weaviate, Chroma, Pinecone, Milvus) and experience building RAG-based solutions.

• Experience developing and deploying models in local, on-premises, or resource-constrained environments, with a solid understanding of model optimization techniques like quantization, batching, and memory optimization.

• Hands-on experience with multi-agent AI systems (LangGraph, CrewAI, AutoGen, OpenAI Assistants API) and building autonomous or AI-driven workflows.

• Strong experience in end-to-end model development, working with business stakeholders to define KPIs and delivering multi-modal (text and image) or ensemble models.

• Familiarity with Linux, Docker, and basic cloud or on-prem infrastructure concepts.

• Experience with distributed training, multi-GPU systems, and handling large-scale models (10B+ parameters or multi-billion token datasets) is a plus.

• Knowledge of inference optimization tools such as vLLM, TensorRT-LLM, and ONNX, along with exposure to MLOps tools for model versioning and monitoring.

• Background in working with security-sensitive or regulated environments (such as finance, healthcare, or government) is preferred.

Selection Process:

• Two technical rounds

• One HR round

Recruiter Details:

• Swetha.Suresh@antal.com

