Anthropic AI Safety Fellow

Salary

$105k

Min Experience

0 years

Location

remote

Job Type

internship

About the job

About the role

The Anthropic Fellows Program is a 6-month external collaboration program focused on accelerating progress in AI safety research by providing promising talent with an opportunity to gain research experience. Our goal is to bridge the gap between industry engineering expertise and the research skills needed for impactful work in AI safety.

Fellows will use external infrastructure (e.g. open-source models, public APIs) to work on an empirical project aligned with our research priorities, with the goal of producing a public output (e.g. a paper submission). Fellows will receive substantial support - including mentorship from Anthropic researchers, funding, compute resources, and access to a shared workspace - enabling them to develop the skills to contribute meaningfully to critical AI safety research.

We are piloting this program with a cohort of 10-15 new collaborators. We aim to onboard our first cohort of Fellows in March 2025, with the possibility of more cohorts depending on applicant interest and logistical needs.

Direct mentorship from Anthropic researchers
Connection to the broader AI safety research community
Weekly stipend of $2,100 USD and access to benefits
Funding for compute and other research expenses
Shared workspaces in Berkeley, California and London, UK

This role will be employed by our third-party talent partner, and may be eligible for benefits through the employer of record.

You may be a good fit if you:

Are motivated by reducing catastrophic risks from advanced AI systems
Are excited to transition into full-time empirical AI safety research and would be interested in a full-time role at Anthropic. Please note: We do not guarantee that we will make any full-time offers to Fellows. However, strong performance during the program may indicate that a Fellow would be a good fit here at Anthropic, and external collaborations have historically provided our teams with substantial evidence that someone might be a good hire.
Have a strong technical background in computer science, mathematics, physics, or related fields
Have strong programming skills, particularly in Python and machine learning frameworks
Can work full-time on the fellowship for 6 months
Have US or UK work authorisation, and are able to work full-time out of Berkeley or London. We may be able to support Fellows based in other locations on a case-by-case basis.
Are comfortable programming in Python
Thrive in fast-paced, collaborative environments
Can execute projects independently while incorporating feedback on research direction

Strong candidates may also have:

Experience with empirical ML research projects
Experience working with Large Language Models
Experience in one of the research areas (e.g. Interpretability)
Experience with deep learning frameworks and experiment management
Track record of open-source contributions

Candidates need not have:

100% of the skills needed to perform the job
Formal certifications or education credentials

Interview process: To ensure we can start onboarding Fellows in March 2025, we will conduct interviews on a rolling basis but set hard cut-off dates for each stage. If you are not able to make that stage’s deadline, we unfortunately will not be able to proceed with your candidacy.

Compensation (USD): This role is not a full-time role with Anthropic, and will be hired via our third-party talent partner. The expected base pay for this role is $2,100/week, with an expectation of 40 hours per week. The expected salary range for this position is:

Annual Salary: $105,000 - $105,000 USD

About the company

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Skills

python
machine learning