Staff Software Engineer - AI Platform (Michelangelo)

Salary

$223k - $248k

Min Experience

6 years

Location

Seattle, WA

Job Type

full-time

About the job

This job is sourced from a job board.

About the role

Partner with stakeholders and lead team efforts to build and maintain machine learning backend services and solutions that support user-facing products, downstream services, and infrastructure tools and platforms used across Uber.

What the Candidate Will Do

  • Design and build tools to empower production teams to innovate and productionize state-of-the-art deep learning models at Uber.
  • Develop and maintain scalable, end-to-end deep learning training systems and frameworks.
  • Ensure distributed training tools are reliable, efficient, and flexible enough to support new production use cases.
  • Collaborate with cross-functional teams including machine learning engineers, backend engineers, data scientists, and data engineers to deliver robust ML solutions for Uber.

Basic Qualifications

  • Master's degree in a relevant field (CS, EE, Math, Stats, etc.) and 6 years of full-time software engineering work experience in deep learning.
  • Proficiency in Python and PyTorch.
  • Expertise in designing, debugging, and optimizing distributed deep learning systems.
  • Working experience with distributed training in PyTorch at scale (e.g., data parallelism, model parallelism).
  • Strong ability to translate complex DL requirements and problems into scalable solutions.

Preferred Qualifications

  • Expertise in distributed training frameworks such as DDP, DeepSpeed, FSDP, or TorchRec.
  • Familiarity with C++, Go or CUDA programming.
  • Expertise in optimizing GPU/TPU training performance and data loading efficiency.
  • Familiarity with large-scale distributed infrastructure tools such as Ray, OpenAI Triton, or PyTorch Lightning.
  • Built and deployed end-to-end machine learning systems in production.
  • Experience training large models (10B+ parameters), such as large recommendation systems or large language models (LLMs).
  • PhD in a relevant field (CS, EE, Math, Stats, etc.).

For San Francisco, CA-based roles: The base salary range for this role is USD$223,000 per year - USD$248,000 per year. For Seattle, WA-based roles: The base salary range for this role is USD$223,000 per year - USD$248,000 per year. For Sunnyvale, CA-based roles: The base salary range for this role is USD$223,000 per year - USD$248,000 per year. For all US locations, you will be eligible to participate in Uber's bonus program, and may be offered an equity award & other types of comp. You will also be eligible for various benefits. More details can be found at the following link https://www.uber.com/careers/benefits.

About the company

Uber

Skills

python
pytorch
distributed training
data parallelism
model parallelism
c++
go
cuda
gpu
tpu
ray
openai triton
pytorch lightning