Website:
hirenza.in
Job details:
About Turing
Based in San Francisco, California, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing supports customers in two ways: first, by accelerating frontier research with high-quality data, advanced training pipelines, and top AI researchers who specialize in coding, reasoning, STEM, multilinguality, multimodality, and agents; and second, by applying that expertise to help enterprises transform AI from proof of concept into proprietary intelligence with systems that perform reliably, deliver measurable impact, and drive lasting results on the P&L.
About The Role
We are looking for experienced Data Analysts (MLE Bench) to contribute to benchmark-driven evaluation projects focused on real-world machine learning systems. This role involves hands-on analytical work with production-like datasets, metrics, and ML outputs to help evaluate, diagnose, and improve the performance of advanced AI systems.
The ideal candidate is comfortable working at the intersection of data analysis and machine learning, with strong analytical rigor and the ability to work with real datasets and ML evaluation workflows. This position offers an exciting opportunity to be at the forefront of AI evaluation, ensuring that models meet performance standards and behave as expected in real-world applications.
Qualifications
- Minimum of 3 years of experience as a Data Analyst or Analytics-focused Engineer.
- Strong proficiency in Python for data analysis and scripting tasks.
- Solid experience with SQL and working with relational datasets.
- Experience analyzing ML outputs and evaluation metrics to assess model performance.
- Strong understanding of statistics, data distributions, and analytical reasoning.
- Ability to work with large, complex datasets and draw reliable, actionable insights.
- Experience writing clean, readable, and well-documented analytical code.
- Excellent spoken and written communication skills in English.
Responsibilities
- Analyze structured and unstructured datasets generated from ML training, inference, and evaluation pipelines to identify patterns, issues, and opportunities for improvement.
- Define, compute, and validate metrics used to evaluate model performance and behavior, ensuring they accurately reflect real-world effectiveness.
- Investigate data distributions, model outputs, failure modes, and edge cases relevant to benchmark tasks to diagnose potential issues.
- Write and run Python and SQL scripts to analyze data, create detailed reports, and support evaluation workflows.
- Validate data quality, consistency, and correctness across multiple datasets and experimental results to ensure reliability.
- Create clear, well-documented analytical artifacts and reproducible workflows to facilitate collaboration and transparency.
- Collaborate closely with ML engineers and researchers to design challenging, real-world evaluation scenarios for MLE Bench projects.
- Participate in continuous improvement of evaluation methodologies and tools to enhance accuracy and efficiency.
Benefits
- Opportunity to work remotely in a fully flexible environment, promoting work-life balance.
- Engage with cutting-edge AI projects and collaborate with leading LLM companies around the world.
- Gain exposure to innovative machine learning evaluation techniques and tools.
- Flexible engagement terms tailored to your availability and expertise.
- Join a forward-thinking organization committed to advancing AI research and deployment.
Equal Opportunity
Turing is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate based on race, ethnicity, gender, age, religion, disability, sexual orientation, or any other protected characteristic. We believe that a diverse team fosters innovation and drives better outcomes for our clients and partners.