Software Engineer, Data Infrastructure

Salary

$30k - $35k

Min Experience

5 years

Location

San Francisco, California

Job Type

full-time

About the job

This job is sourced from a job board.

About the role

We're looking for a Staff Software Engineer with deep expertise in Data Infrastructure to help build the systems that power our foundation models. You'll join a small, high-impact team responsible for architecting and scaling the core infrastructure behind distributed training pipelines, multimodal data catalogs, and intelligent processing systems that operate over petabytes of data. Infrastructure is critical to us: it's the bedrock that enables every breakthrough.

You'll work directly with researchers to accelerate experiments, develop new datasets, improve infrastructure efficiency, and enable key insights across our data assets. If you're excited by distributed systems, large-scale data mining, open-source tools like Spark, Kafka, Beam, Ray, and Delta Lake, and enjoy building from the ground up, we'd love to hear from you.

What you'll do

Design, build, and operate scalable, fault-tolerant infrastructure for LLM research: distributed compute, data orchestration, and storage across modalities.
Develop high-throughput systems for data ingestion, processing, and transformation, including training data catalogs, deduplication, quality checks, and search.
Build systems for traceability, reproducibility, and robust quality control at every stage of the data lifecycle.
Implement and maintain monitoring and alerting to support platform reliability and performance.
Collaborate with research teams to unlock new features, improve data quality, and accelerate training cycles.

About the company

Thinking Machines Lab's mission is to empower humanity through advancing collaborative general intelligence. We're building a future where everyone has access to the knowledge and tools to make AI work for their unique needs and goals. We are a small team of scientists, engineers, and builders who've created some of the most widely used AI products, like ChatGPT, Character.ai, Mistral, PyTorch, OpenAI Gym, Fairseq, and Segment Anything.

Skills

apache spark
ray
kafka
dbt
terraform
airflow
web crawler
deduplication
data mining
search
cloud infrastructure
data lake
batch pipelines
streaming pipelines
parquet
delta lake