Giramille
Website: giramille.com
Job details:
Contract Opportunity - Senior AI Audio Engineer (Voice & Music Systems)
Senior AI Audio Engineer (Voice & Music Systems) | AI Music & Voice Synthesis Engineer | AI Systems Engineer - Audio & Voice Generation
🎯 Objective
We are building a proprietary AI system to scale children’s music globally across 32+ languages while maintaining the original voice identity, musicality, and emotional consistency. The system should preferably be built with open-source technologies.
This opportunity includes a paid technical deliverable ($1,000 USD) upon successful delivery and validation.
Top performers may be invited for long-term collaboration and leadership roles.
🧠 Scope of Work
Develop a local-first AI system capable of:
- Cloning a singer’s voice with high fidelity
- Re-singing pre-written lyrics (already adapted manually) in 32+ languages
- Preserving:
  - Melody
  - Timing
  - Emotional delivery
  - Musical phrasing
📥 Inputs (Provided)
- Instrumental track (WAV)
- Vocal track (Portuguese)
- Vocal track (English)
- Lyrics in 32+ languages (already adapted, NOT translation)
📤 Expected Output
- 32+ fully rendered songs
- Same voice identity
- Same instrumental
- Natural singing in each language
- Studio-level audio quality
⚙️ System Requirements
The solution must:
- Run 100% locally (offline)
- Be built with open-source technologies where possible
- Be delivered with full source code ownership
- Include a simple and intuitive front-end UI
- Allow batch processing
- Be optimized for GPU acceleration (NVIDIA preferred)
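One way to read the batch-processing requirement is as one render job per target language, all sharing the same instrumental and reference vocal. The following is only a minimal sketch under stated assumptions: the directory layout (one `<language>.txt` lyrics file per language) and the names `RenderJob` and `plan_batch` are hypothetical and do not come from this posting.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RenderJob:
    """One song render for a single target language (hypothetical structure)."""
    language: str
    lyrics_path: Path
    instrumental_path: Path
    reference_vocal_path: Path
    output_path: Path


def plan_batch(lyrics_dir: Path, instrumental: Path, reference_vocal: Path,
               out_dir: Path) -> list[RenderJob]:
    """Build one job per lyrics file; the file stem is assumed to be the language code."""
    jobs = []
    for lyrics_file in sorted(lyrics_dir.glob("*.txt")):
        lang = lyrics_file.stem
        jobs.append(RenderJob(
            language=lang,
            lyrics_path=lyrics_file,
            instrumental_path=instrumental,       # same instrumental for every language
            reference_vocal_path=reference_vocal,  # same voice identity source
            output_path=out_dir / f"{lang}.wav",
        ))
    return jobs
```

A front-end UI or command-line runner could then iterate over the returned jobs and dispatch each one to the GPU in turn.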
🏗️ Technical Context (Flexible Approach)
Candidates are free to define their architecture. The system will likely involve:
- Voice Cloning / Voice Conversion (RVC, So-VITS-SVC, or similar)
- Singing Voice Synthesis (SVS)
- Phoneme alignment / forced alignment
- Multilingual phonetic modeling
- Audio post-processing (mix/master)
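The components listed above could be chained as independent stages that pass a shared context from one to the next. This is a sketch of one possible ordering, not a prescribed architecture: every stage below is a hypothetical placeholder that only records what a real implementation would produce, and the stage names and context keys are assumptions made for illustration.

```python
from typing import Callable

# A stage takes the shared context dict and returns an updated copy.
Stage = Callable[[dict], dict]


def align_phonemes(ctx: dict) -> dict:
    """Placeholder for forced alignment of the target lyrics to the reference timing."""
    return {**ctx, "alignment": f"alignment for {ctx['language']}"}


def synthesize_singing(ctx: dict) -> dict:
    """Placeholder for singing voice synthesis driven by the alignment."""
    return {**ctx, "raw_vocal": f"svs vocal ({ctx['alignment']})"}


def convert_voice(ctx: dict) -> dict:
    """Placeholder for voice conversion toward the original singer's identity."""
    return {**ctx, "vocal": f"converted {ctx['raw_vocal']}"}


def mix_and_master(ctx: dict) -> dict:
    """Placeholder for mixing the vocal with the provided instrumental."""
    return {**ctx, "mix": f"{ctx['vocal']} + {ctx['instrumental']}"}


DEFAULT_STAGES: list[Stage] = [
    align_phonemes, synthesize_singing, convert_voice, mix_and_master,
]


def run_pipeline(ctx: dict, stages: list[Stage] = DEFAULT_STAGES) -> dict:
    """Run each stage in order, threading the context through."""
    for stage in stages:
        ctx = stage(ctx)
    return ctx


result = run_pipeline({"language": "es", "instrumental": "instrumental.wav"})
print(result["mix"])
```

Keeping each stage behind the same signature makes it straightforward to swap one model for another without touching the rest of the chain.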
💡 Suggested Directions (Optional)
- Voice conversion pipelines
- Phoneme-level alignment systems
- Hybrid timing + phonetic transfer approaches
- AI + rule-based audio control systems
- Modular or multi-agent pipelines
🧪 Evaluation Criteria
- Voice similarity
- Naturalness across languages
- Timing and musical alignment
- Audio quality
- System stability
- Ease of use
- Processing performance
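Two of these criteria lend themselves to simple numeric proxies. The sketch below is an assumption about how they might be measured, not the evaluation method used for this role: voice similarity is approximated as cosine similarity between two speaker-embedding vectors (how the embeddings are obtained is out of scope here), and timing alignment as the mean absolute deviation between matching note onsets. Both function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero embedding vectors."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("embeddings must be non-empty and of equal length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must not be all zeros")
    return dot / (norm_a * norm_b)


def mean_onset_deviation_ms(reference_onsets_s: Sequence[float],
                            rendered_onsets_s: Sequence[float]) -> float:
    """Mean absolute difference, in milliseconds, between paired onset times given in seconds."""
    if len(reference_onsets_s) != len(rendered_onsets_s) or len(reference_onsets_s) == 0:
        raise ValueError("onset lists must be non-empty and of equal length")
    total = sum(abs(r - o) for r, o in zip(reference_onsets_s, rendered_onsets_s))
    return 1000.0 * total / len(reference_onsets_s)
```

Naturalness and audio quality are harder to reduce to a single number and would likely still need listening tests.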
💰 Compensation
- Fixed deliverable: $1,000 USD
- Payment tied to successful delivery and validation
⏱️ Timeline
- Deadline: 20 days from acceptance
⚖️ Legal & Ownership
- Full system (code, models, UI) must be delivered
- All deliverables are assigned under a work-for-hire agreement
- Open-source components are allowed (MIT / Apache licenses preferred)
- No dependencies with restrictive licenses
🌍 Profile
- AI / ML (Audio)
- Voice / TTS / SVS experience
- PyTorch / TensorFlow
- Audio pipelines
Bonus: multilingual and music production experience
📍 Location
Remote (Global)
🔥 Context
Part of a broader strategy to:
- Scale global music distribution
- Build proprietary AI infrastructure
- Reduce localization costs
📩 Apply
Ensure your LinkedIn profile is fully updated with:
- Portfolio / GitHub
- Relevant experience
- Key AI/audio projects
You may include a brief high-level approach in your profile.
Complete the triage questions.
Important:
- Do not send direct messages
- Applications will be reviewed based on profile + responses
🏁 Selection Process
- Initial screening
- Top candidates invited to present approach
- Final evaluation via structured questionnaire
- Final candidate(s) selected for execution
🧭 Final Note
We are looking for builders who can deliver real systems.
Click Apply to learn more.