Website:
agilegridsolution.com
Job details:
About The Company
Based in San Francisco, California, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing supports customers in two ways: first, by accelerating frontier research with high-quality data, advanced training pipelines, and top AI researchers who specialize in coding, reasoning, STEM, multilinguality, multimodality, and agents; and second, by applying that expertise to help enterprises transform AI from proof of concept into proprietary intelligence with systems that perform reliably, deliver measurable impact, and drive lasting results on the P&L.
About The Role
We are looking for experienced Data Analysts (MLE Bench) to contribute to benchmark-driven evaluation projects focused on real-world machine learning systems. This role involves hands-on analytical work with production-like datasets, metrics, and ML outputs to help evaluate, diagnose, and improve the performance of advanced AI systems. The ideal candidate is comfortable working at the intersection of data analysis and machine learning, with strong analytical rigor and the ability to work with real datasets and ML evaluation workflows.
Qualifications
The successful candidate should have at least 3 years of experience as a Data Analyst or Analytics-focused Engineer. Proficiency in Python for data analysis and solid experience with SQL and relational datasets are essential. Candidates should have experience analyzing ML outputs and evaluation metrics, along with a strong understanding of statistics and analytical reasoning. The ability to work with large, complex datasets and derive reliable insights is crucial. Additionally, candidates must demonstrate the ability to write clean, readable, and well-documented analytical code. Excellent spoken and written English communication skills are also required.
Responsibilities
- Analyze structured and unstructured datasets generated from ML training, inference, and evaluation pipelines.
- Define, compute, and validate metrics used to evaluate model performance and behavior.
- Investigate data distributions, model outputs, failure modes, and edge cases relevant to benchmark tasks.
- Write and run Python and SQL code to analyze data, create reports, and support evaluation workflows.
- Validate data quality, consistency, and correctness across datasets and experiments.
- Create clear, well-documented analytical artifacts and reproducible analysis workflows.
- Collaborate with ML engineers and researchers to design challenging, real-world evaluation scenarios for MLE Bench.
Benefits
Working as a freelancer with Turing offers the flexibility of a fully remote environment, allowing you to work from anywhere. You will have the opportunity to engage with cutting-edge AI projects alongside leading language model companies, enhancing your expertise and portfolio. Turing also provides a dynamic and supportive community of professionals working on innovative AI solutions, fostering continuous learning and growth.
Offer Details
- Commitments Required: At least 4 hours per day and a minimum of 20 hours per week, including a 4-hour overlap with PST.
- Engagement Type: Contractor assignment (note that this does not include medical or paid leave benefits).
- Duration of Contract: 3 months, with the possibility of extension based on project needs and performance.
Equal Opportunity
Turing is committed to fostering an inclusive environment and is proud to be an equal opportunity employer. We value diversity and do not discriminate based on race, ethnicity, gender, age, religion, sexual orientation, disability, or any other protected characteristic. We encourage individuals from all backgrounds to apply and join our innovative team dedicated to advancing the frontiers of AI technology.
Click Apply to learn more.