Capital Numbers
Website: capitalnumbers.com
Job details:
We are looking for a Senior Speech AI Engineer to build production-grade, on-device Automatic Speech Recognition (ASR) and real-time speech intelligence systems.
In this role, you’ll work across the full speech AI lifecycle, from audio data pipelines and model development to low-latency streaming inference and edge deployment. You’ll help deliver accurate transcription, phoneme-level alignment, and real-time pronunciation feedback, all optimized for mobile and edge devices.
Key Skills & Experience
✔ 5–8+ years in Speech AI / Audio ML
✔ Strong Python & PyTorch expertise
✔ Experience with ASR models such as Whisper, Conformer, RNN-T, wav2vec 2.0, HuBERT
✔ Knowledge of speech processing & phoneme alignment
✔ Experience optimizing models for edge / mobile deployment (TensorFlow Lite, ONNX, PyTorch Mobile, Core ML)
✔ Familiarity with libraries such as NVIDIA NeMo, ESPnet, SpeechBrain, torchaudio
Nice to Have
• Experience with multilingual or low-resource ASR
• Prior work on pronunciation assessment or speech-learning tools
• Experience with datasets such as Common Voice or LibriSpeech
If you’re passionate about building fast, accurate, and privacy-first speech AI systems, we’d love to connect.
Click Apply to learn more.