Website:
fetchjobs.co
Job details:
About The Company
Based in San Francisco, California, Turing is the world’s leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing supports customers in two ways: first, by accelerating frontier research with high-quality data, advanced training pipelines, and top AI researchers who specialize in coding, reasoning, STEM, multilinguality, multimodality, and agents; and second, by applying that expertise to help enterprises turn AI proofs of concept into proprietary intelligence, with systems that perform reliably, deliver measurable impact, and drive lasting results on the P&L.
About The Role
We are seeking experienced Data Analysts (MLE Bench) to join our team and contribute to benchmark-driven evaluation projects focused on real-world machine learning systems. In this role, you will work hands-on with production-like datasets, metrics, and ML outputs to evaluate, diagnose, and enhance the performance of cutting-edge AI systems. The ideal candidate will possess a strong analytical mindset, experience working at the intersection of data analysis and machine learning, and the ability to handle complex datasets and evaluation workflows effectively.
Qualifications
To be successful in this role, candidates should have at least 3 years of experience as a Data Analyst or analytics-focused engineer. Proficiency in Python for data analysis and solid experience with SQL and relational datasets are essential. Candidates should have prior experience analyzing ML outputs and evaluation metrics, along with a strong understanding of statistics and analytical reasoning. The ability to work with large, complex datasets and generate reliable insights is critical. Additionally, candidates must be able to write clean, well-documented, and reproducible analytical code, and must have excellent spoken and written English communication skills.
Responsibilities
- Analyze structured and unstructured datasets generated from machine learning training, inference, and evaluation pipelines to extract meaningful insights.
- Define, compute, and validate metrics used to evaluate model performance and behavior, ensuring accuracy and relevance.
- Investigate data distributions, model outputs, failure modes, and edge cases relevant to benchmark tasks to identify areas for improvement.
- Write and run Python and SQL scripts to analyze data, generate reports, and support evaluation workflows efficiently.
- Validate data quality, consistency, and correctness across multiple datasets and experimental results to maintain high standards of data integrity.
- Create clear, well-documented analytical artifacts and workflows to ensure reproducibility and ease of understanding for team members.
- Collaborate closely with ML engineers and researchers to design challenging, real-world evaluation scenarios for benchmarking machine learning models.
Benefits
Working with Turing offers the opportunity to be part of a forward-thinking organization engaged in cutting-edge AI research and applications. You will have the flexibility to work remotely from anywhere, enabling a healthy work-life balance. Additionally, you will gain exposure to high-impact AI projects with leading LLM companies, enhancing your professional growth and expertise in the field of machine learning and data analysis. Turing also provides a collaborative environment where your contributions directly influence the development and evaluation of next-generation AI systems.
Equal Opportunity
Turing is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate based on race, ethnicity, gender, age, religion, sexual orientation, disability, or any other protected characteristic. We believe that diverse teams foster innovation and drive better outcomes, and we welcome applicants from all backgrounds to join our mission to advance AI research and applications.