Machine Learning Performance Engineer

Min Experience

5 years

Location

Sunnyvale, California

Job Type

full-time

About the job

This job is sourced from a job board.

About the role

At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status, pregnancy or related condition (including breastfeeding) or any other basis as protected by applicable law.

Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems. Our vision is to create autonomy that propels the world forward. Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving.

At Wayve, big problems ignite us: we embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future.

At Wayve, your contributions matter. We value diversity, embrace new perspectives, and foster an inclusive work environment; we back each other to deliver impact. Make Wayve the experience that defines your career!

The Role

We are seeking skilled engineers to join our Machine Learning Platform team to work on optimising large-scale training jobs as we aim to scale our models through the next order of magnitude. The Machine Learning Platform team owns our GPU training infrastructure and the software abstractions around it, and you will have a specific focus on improving training efficiency.

Challenges you will own

Maximising the MFU (model FLOPs utilisation) of our large-scale training jobs.
Profiling training code and identifying its bottlenecks.
Implementing GPU kernels to improve training throughput.
Working closely with Research teams to integrate and test training efficiency improvements.
Owning and improving our GPU training clusters.

About You

Essential:

5+ years of experience in performance optimization or ML engineering.
Experience optimizing large-scale training jobs on GPU compute clusters.
Experience working in platform teams and with research teams.
Experience tracking and reporting benchmarked performance over time in an open and accessible way.
Ability to write high-quality, well-structured and tested Python code.
BS or MS in Machine Learning, Computer Science, Engineering, or a related technical discipline, or equivalent experience.

Desirable:

Solid experience working with concurrent, parallel and distributed computing.
Experience using NVIDIA Nsight Systems.
Experience implementing GPU kernels.
Knowledge of computing fundamentals: what makes code fast, secure and reliable.

This is a full-time role based in our office in Sunnyvale, California. At Wayve we want the best of all worlds, so we operate a hybrid working policy that combines time together in our offices and workshops to fuel innovation, culture, relationships and learning, and time spent working from home. We operate core working hours so you can determine the schedule that works best for you and your team.

About the company

Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems.

Skills

Python
GPU
ML engineering
performance optimization