Senior Speech R&D specialist
Gnani.ai
- Location: Bengaluru, Karnataka, India
- Job type: Full-time
Required skills
- Python
- Bash
- C++
- end-to-end
- Git
- Linux
About the role
Gnani.ai
Website: gnani.ai
Job details:
- As a Senior Speech R&D Specialist, you will own the end-to-end speech data lifecycle that powers advanced speech and speech-to-speech models. Your primary responsibility is to build, curate, validate, and deliver high-quality datasets that enable robust speech understanding, generation, and conversational interaction.
- This role is critical to ensuring that models are trained on clean, diverse, well-annotated, and model-ready data.
Key Responsibilities
Speech Data Curation
- Build datasets supporting:
  - Speech recognition and understanding
  - Multilingual and code-mixed speech
  - Conversational and dialog-style speech
  - Speech generation and synthetic voice data
  - Audio-to-audio conversational scenarios
- Prepare datasets that capture:
  - Speaker variation and continuity
  - Emotional and expressive speech cues
  - Real-world noise and acoustic conditions
  - Conversational turn structure and timing
Ensemble-Based Data Curation
- Implement data curator pipelines using outputs from:
  - Multiple in-house speech models
  - External or open-source speech models
- Aggregate, reconcile, and validate model outputs to:
  - Generate reliable annotations
  - Filter low-confidence samples
  - Detect inconsistencies and label noise
- Apply rule-based and confidence-driven selection strategies.
Validation & Quality Control
- Perform automated validation for:
  - Audio integrity and format consistency
  - Transcript alignment and correctness
  - Language and speaker metadata accuracy
- Run sampling-based manual audits.
- Produce dataset quality reports and summaries.
Engineering & Operations
- Build scalable, reproducible data pipelines in Python and C++.
- Handle large audio corpora efficiently on Linux systems.
- Generate training-ready manifests and metadata.
- Maintain dataset versions, lineage, and reproducibility.
Required Skills
- 4+ years of experience in R&D or ML data pipelines.
- Strong Python skills for large-scale data processing.
- Experience working with audio or speech datasets.
- Familiarity with annotation formats and metadata schemas.
- Knowledge of Linux, Bash, and Git workflows.