techolution
Website: techolution.com
Job details:
The next era of enterprise AI sounds like a conversation, not a chatbot. We're hiring the talent who'll build the voice behind it.
We're hiring a Voice AI Engineer to own the voice workstream across two production AI engagements launching this quarter. This is not a "support the lead" seat. You'll be the voice expert in the room, owning the architecture, building the pipeline, and stepping up as Product Owner for the voice scope. You'll work shoulder-to-shoulder with engineering leadership and directly with our client teams. The voice you ship will be heard by real users on Day 1 of go-live.
About Techolution
Techolution is a global AI firm and Google Premier Partner. For 12+ years we've helped Fortune 500s and governments move AI from the lab into production, quietly shipping the work that most of the industry is still pitching in slide decks.
Our bet for what comes next, and we believe it's where the industry is heading, is Full Enterprise Automation (FEA): AI agents running real enterprise work end-to-end, not bolted onto the side of it. FEA is the lens for everything we build, and voice is one of the fastest-moving surfaces inside it, the place where AI stops being a tool and starts being a colleague. This role sits at the front of that bet.
What you'll own
- Architecting and shipping production-grade voice AI pipelines end-to-end — speech-to-text, agentic LLM orchestration, text-to-speech, the whole loop
- Hands-on engineering with Whisper, Gemini TTS, and modern agentic LLM frameworks on a Python / Flask backend
- Real audio engineering: waveform analysis, audio transformations, latency tuning, and debugging the production failures that don't show up in demos
- Stepping up as Product Owner for the voice scope — defining what gets built, in what order, and why
- Direct client conversations: articulating trade-offs, managing expectations, owning the room when voice is the topic
- Contributing to FEA-aligned engagements beyond voice when the work calls for it
You're a strong fit if you have
- Production voice AI experience you can walk through end-to-end — a real app, with real users, that you helped build and you've had to debug at 2 AM
- Hands-on depth with Whisper, Gemini TTS (or comparable modern voice stacks) and at least one agentic LLM framework
- Strong Python and Flask fundamentals — you've shipped backend services, not just notebooks
- A real grasp of audio at the model level: waveforms, transformations, why a Mel-spectrogram is the input it is, and what a vocoder actually does
- Comfort with ambiguity and ownership — you drive a workstream forward without being managed through it
- Client-facing presence: you can explain why one architecture beats another to a stakeholder who doesn't share your stack
Even better if you've also done
- Fine-tuning of ASR or TTS models (Whisper, VITS, FastSpeech 2, HiFi-GAN family)
- Production voice work in anomaly detection, speech translation, or real-time multilingual systems
- Real-time streaming voice with LiveKit, WebRTC, or full-duplex telephony pipelines
- Diarization (pyannote, ECAPA-TDNN) or voice personalization / cloning
What working here actually looks like
A small, focused team where C.A.R.E — Collaboration, Ownership, Curiosity, Empathy — isn't a poster on the wall, it's how the work gets done. Direct exposure to leadership, our clients, and the strategic conversation about where FEA goes next. Real autonomy, real accountability, real impact on engagements that ship. If you've been doing one narrow slice of voice AI inside a larger team and you're ready to own the whole build, this is the role.
Logistics
- Location: Hyderabad, India
- Start: Immediate joiners or candidates serving notice only
- Reporting to: Engineering leadership
- Type: Full-time