Voice AI Engineer

Techolution

full-time

Required skills

Python
backend
end-to-end
Flask
WebRTC

About the role

Website: techolution.com
Job details:

The next era of enterprise AI sounds like a conversation, not a chatbot. We're hiring the talent who'll build the voice behind it.

We're hiring a Voice AI Engineer to own the voice workstream across two production AI engagements launching this quarter. This is not a "support the lead" seat. You'll be the voice expert in the room, owning the architecture, building the pipeline, and stepping up as Product Owner for the voice scope. You'll work shoulder-to-shoulder with engineering leadership and directly with our client teams. The voice you ship will be heard by real users on Day 1 of go-live.

About Techolution

Techolution is a global AI firm and Google Premier Partner. For 12+ years we've helped Fortune 500s and governments move AI from the lab into production, quietly shipping the work that most of the industry is still pitching in slide decks.

Our bet for what comes next, and where we believe the industry is heading, is Full Enterprise Automation (FEA): AI agents running real enterprise work end-to-end, not bolted onto the side of it. FEA is the lens for everything we build, and voice is one of the fastest-moving surfaces inside it, the place where AI stops being a tool and starts being a colleague. This role sits at the front of that bet.

What you'll own

Architecting and shipping production-grade voice AI pipelines end-to-end — speech-to-text, agentic LLM orchestration, text-to-speech, the whole loop
Hands-on engineering with Whisper, Gemini TTS, and modern agentic LLM frameworks on a Python / Flask backend
Real audio engineering: waveform analysis, audio transformations, latency tuning, and debugging the production failures that don't show up in demos
Stepping up as Product Owner for the voice scope — defining what gets built, in what order, and why
Direct client conversations: articulating trade-offs, managing expectations, owning the room when voice is the topic
Contributing to FEA-aligned engagements beyond voice when the work calls for it

You're a strong fit if you have

Production voice AI experience you can walk through end-to-end — a real app, with real users, that you helped build and you've had to debug at 2 AM
Hands-on depth with Whisper, Gemini TTS (or comparable modern voice stacks) and at least one agentic LLM framework
Strong Python and Flask fundamentals — you've shipped backend services, not just notebooks
A real grasp of audio at the model level: waveforms, transformations, why a Mel-spectrogram is the input it is, and what a vocoder actually does
Comfort with ambiguity and ownership — you drive a workstream forward without being managed through it
Client-facing presence: you can explain why one architecture beats another to a stakeholder who doesn't share your stack

Even better if you've also done

Fine-tuning of ASR or TTS models (Whisper, VITS, FastSpeech 2, HiFi-GAN family)
Production voice work in anomaly detection, speech translation, or real-time multilingual systems
Real-time streaming voice with LiveKit, WebRTC, or full-duplex telephony pipelines
Diarization (pyannote, ECAPA-TDNN) or voice personalization / cloning

What working here actually looks like

A small, focused team where C.A.R.E — Collaboration, Ownership, Curiosity, Empathy — isn't a poster on the wall, it's how the work gets done. Direct exposure to leadership, our clients, and the strategic conversation about where FEA goes next. Real autonomy, real accountability, real impact on engagements that ship. If you've been doing one narrow slice of voice AI inside a larger team and you're ready to own the whole build, this is the role.

Logistics

Location: Hyderabad, India
Start: Immediate joiners or candidates serving notice only
Reporting to: Engineering leadership
Type: Full-time

Click on Apply to know more.