Senior AI Audio Engineer (Voice & Music Systems)

Giramille

Location: Bengaluru, Karnataka, India
Job type: Part-time

Required skills

Apache
front-end
GitHub
GPU
Source Code
TensorFlow
PyTorch

About the role

Website: giramille.com
Job details:

Contract Opportunity - Senior AI Audio Engineer (Voice & Music Systems)

Senior AI Audio Engineer (Voice & Music Systems) | AI Music & Voice Synthesis Engineer | AI Systems Engineer - Audio & Voice Generation

🎯 Objective

We are building a proprietary AI system to scale children’s music globally across 32+ languages while maintaining the original voice identity, musicality, and emotional consistency. The system should preferably be built with open-source technologies.

This opportunity includes a paid technical deliverable ($1,000 USD), payable upon successful delivery and validation.

Top performers may be invited for long-term collaboration and leadership roles.

🧠 Scope of Work

Develop a local-first AI system capable of:

Cloning a singer’s voice with high fidelity
Re-singing pre-written lyrics (already adapted manually) in 32+ languages
Preserving:
Melody
Timing
Emotional delivery
Musical phrasing

📥 Inputs (Provided)

Instrumental track (WAV)
Vocal track (Portuguese)
Vocal track (English)
Lyrics in 32+ languages (already adapted, NOT translated)

📤 Expected Output

32+ fully rendered songs
Same voice identity
Same instrumental
Natural singing in each language
Studio-level audio quality

⚙️ System Requirements

The solution must:

Run 100% locally (offline)
Preferably be built with open-source technologies
Deliver full source code ownership
Include a simple and intuitive front-end UI
Allow batch processing
Be optimized for GPU acceleration (NVIDIA preferred)

🏗️ Technical Context (Flexible Approach)

Candidates are free to define their architecture. The system will likely involve:

Voice Cloning / Voice Conversion (RVC, So-VITS-SVC, or similar)
Singing Voice Synthesis (SVS)
Phoneme alignment / forced alignment
Multilingual phonetic modeling
Audio post-processing (mix/master)

💡 Suggested Directions (Optional)

Voice conversion pipelines
Phoneme-level alignment systems
Hybrid timing + phonetic transfer approaches
AI + rule-based audio control systems
Modular or multi-agent pipelines

🧪 Evaluation Criteria

Voice similarity
Naturalness across languages
Timing and musical alignment
Audio quality
System stability
Ease of use
Processing performance

💰 Compensation

Fixed deliverable: $1,000 USD
Payment tied to successful delivery and validation

⏱️ Timeline

Deadline: 20 days from acceptance

⚖️ Legal & Ownership

Full system (code, models, UI) must be delivered
All deliverables assigned under a Work-for-Hire agreement
Open-source components allowed (MIT / Apache licenses preferred)
No dependencies with restrictive licenses

🌍 Profile

AI / ML (Audio)
Voice / TTS / SVS experience
PyTorch / TensorFlow
Audio pipelines

Bonus: multilingual + music production

📍 Location

Remote (Global)

🔥 Context

Part of a broader strategy to:

Scale global music distribution
Build proprietary AI infrastructure
Reduce localization costs

📩 Apply

Ensure your LinkedIn profile is fully updated with:

Portfolio / GitHub
Relevant experience
Key AI/audio projects

You may include a brief high-level approach in your profile.

Complete the triage questions.

Important:

Do not send direct messages
Applications will be reviewed based on profile + responses

🏁 Selection Process

Initial screening
Top candidates invited to present their approach
Final evaluation via structured questionnaire
Final candidate(s) selected for execution

🧭 Final Note

We are looking for builders who can deliver real systems.