Senior Data Scientist

Location

Bengaluru, Karnataka, India

Job Type

Full-time

About the job

This job is sourced from a job board.

Overview

Website: cc33.co.uk
Job details:

About the Role

We are building AI-powered Voice Bots and intelligent automation products for the contact centre industry. Our stack spans Speech-to-Text (STT), Text-to-Speech (TTS), and Large Language Models (LLMs), with a strong preference for self-hosted, open-source models deployed on our own infrastructure. We're looking for a Data Scientist / Research Scientist who can own the model layer end-to-end — from scouting and benchmarking open-source models to deploying and fine-tuning them on-premises.

What You'll Do

Continuously evaluate and benchmark open-source STT models, TTS models, and LLMs to identify the best fit for production workloads.
Recommend and specify the GPU and hardware requirements for on-premises model deployments, balancing cost, latency, and throughput.
Fine-tune and adapt foundation models (LLMs, STT, TTS) on domain-specific and multilingual datasets when off-the-shelf performance isn't sufficient.
Build and deploy traditional ML and Deep Learning models for tasks where a lightweight, purpose-built model outperforms a general-purpose LLM — classification, intent detection, entity extraction, anomaly detection, etc.
Collaborate closely with the engineering team to integrate models into production voice bot and automation pipelines.
Stay current with the rapidly evolving open-source AI landscape and bring new techniques, architectures, and tools to the team.

Requirements

Strong foundation in Machine Learning and Deep Learning — hands-on experience building, training, and evaluating models beyond just calling APIs.
Experience working with open-source LLMs (Llama, Mistral, Qwen, Gemma, etc.) — loading, quantizing, benchmarking, and serving them.
Familiarity with STT and TTS model ecosystems (Whisper, Coqui, VITS, Piper, etc.) and their performance trade-offs.
Ability to recommend and justify hardware configurations (GPU type, VRAM, compute requirements) for on-premises model hosting.
Experience fine-tuning models using techniques like LoRA, QLoRA, or full fine-tuning on custom datasets.
Proficiency in Python and the core ML/DL toolkit — PyTorch, Hugging Face Transformers, scikit-learn, NumPy, pandas.
Experience with model serving and inference optimization (vLLM, TGI, ONNX Runtime, TensorRT, or similar).
Understanding of when to use a large model vs. a small, task-specific model and the ability to make that trade-off pragmatically.
Strong analytical and problem-solving skills with the ability to read and implement ideas from research papers.

Bonus / Good-to-Have (Senior Consideration)

Candidates with the following skills will be considered for a senior position:

Experience with Python web frameworks such as FastAPI or Flask for building model-serving APIs or backend services.
Hands-on experience with agentic orchestration frameworks — LangChain, LangGraph, CrewAI, or similar.
Prior experience hosting and managing LLMs on bare-metal or cloud GPU infrastructure in production (not just experimentation).
Experience fine-tuning STT and TTS models specifically (e.g., fine-tuning Whisper on domain-specific audio, training custom TTS voices).
Familiarity with MLOps practices — experiment tracking, model versioning, CI/CD for model pipelines.

Skills

LangChain

Python

backend

deep learning

end-to-end

FastAPI

Flask

GPU

machine learning

NumPy

pandas

spec

PyTorch

ONNX