Lead Data Scientist

AiLogic Neural Network Pvt Ltd

Full-time

Required skills

Python
AWS
API
data science
Docker
end-to-end
Kubernetes
machine learning
NLP
TensorFlow
PyTorch

About the role

Website: ailogic.com
Job details:

About Us

We are building transformer models focused on multilingual translation and custom transformer architectures. Our team works on large-scale NLP and transformer-based models with a strong focus on research, experimentation, model optimization, and production-grade AI systems.

We are looking for a highly technical Data Science Lead who can drive end-to-end model development, mentor engineers, and lead the evolution of our AI stack.

This is a core engineering and research role, not a prompt engineering or API integration role.

Role Overview

As a Data Science Lead, you will lead the design, training, fine-tuning, evaluation, and optimization of transformer-based NLP systems and LLMs. You will work closely with engineering and research teams to build scalable AI models from the ground up and establish technical direction for the organization.

You should be comfortable working deeply with model internals, training pipelines, datasets, tokenization strategies, and distributed GPU-based training environments.

Key Responsibilities

Lead the architecture and development of transformer-based AI systems
Drive technical direction for NLP, LLM, and multilingual AI initiatives
Mentor and guide junior ML engineers and researchers
Train and fine-tune transformer models on large-scale custom datasets
Work on Seq2Seq / encoder-decoder architectures for translation and text generation
Optimize model performance using LoRA, QLoRA, PEFT, quantization, and distributed training techniques
Design tokenization pipelines using BPE / SentencePiece
Evaluate models using BLEU, chrF, accuracy, and custom benchmarks
Collaborate with platform teams for GPU infrastructure and scalable deployments

Required Qualifications

6+ years of experience in Data Science / Machine Learning / NLP
Strong hands-on expertise in Transformers, LLMs, Seq2Seq Architectures, and Attention Mechanisms
Proven experience in model training and fine-tuning using custom datasets
Hands-on experience with Hugging Face, PyTorch, or TensorFlow
Experience with vector databases (Pinecone, Milvus) for embeddings and semantic search in translation or LLM applications
Experience with distributed training, GPU optimization, and NLP evaluation metrics

Strongly Preferred

Experience with LoRA / QLoRA / PEFT
DeepSpeed / FSDP knowledge
RLHF / SFT experience
Multilingual model development
Machine Translation systems
Speech or sequence modeling systems
Research/publication background in AI/ML/NLP

This Role Is NOT Focused On

Prompt engineering only
LangChain-only development
OpenAI/GPT API integrations without model training
RAG pipelines without core model development
Low-code GenAI workflows

Tech Stack

Python
PyTorch / TensorFlow
Hugging Face
Transformers
DeepSpeed / PEFT
AWS / GPU Infrastructure
Docker / Kubernetes
Vector Databases & NLP Tooling

What We’re Looking For

Strong ownership and builder mindset
Ability to independently lead research and experimentation
Strong mentorship and leadership capabilities
Deep understanding of transformer architectures
Ability to bridge research and production AI systems
