Staff Engineer

Salary

$264k - $285k

Min Experience

5 years

Location

Palo Alto, California, United States

Job Type

full-time

About the job

About the role

About Workato

Workato delivers enterprise infrastructure for the agentic era, redefining iPaaS and helping enterprises unify data, applications, processes, and AI into a single, governed platform. A leader in Enterprise MCP and trusted by 50% of the Fortune 500, Workato’s cloud-native architecture connects every application, data source, and process to power real-time orchestration at scale. With enterprise-grade security and continuous innovation at its core, Workato provides the trusted foundation for organizations to automate with confidence and operationalize AI across the business. To learn more, visit www.workato.com

Why join us?

Ultimately, Workato believes in fostering a flexible, trust-oriented culture that empowers everyone to take full ownership of their roles. We are driven by innovation and looking for team players who want to actively build our company. 

But we also believe in balancing productivity with self-care. That’s why we offer all of our employees a vibrant and dynamic work environment along with a multitude of benefits they can enjoy inside and outside of their work lives.

If this sounds right up your alley, please submit an application. We look forward to getting to know you!

Also, feel free to check out why:

  • Business Insider named us an “enterprise startup to bet your career on”

  • Forbes’ Cloud 100 recognized us as one of the top 100 private cloud companies in the world

  • Deloitte Tech Fast 500 ranked us as the 17th fastest growing tech company in the Bay Area, and 96th in North America

  • Quartz ranked us the #1 best company for remote workers

Workato Inc. seeks Staff Engineer in Palo Alto, CA

Job Duties:

  • Design and develop production-grade distributed services in Rust using async/Tokio, with focus on concurrency, performance, and scalability
  • Own the full service lifecycle from system design and implementation through deployment and operations
  • Build and optimize data-processing and transformation pipelines with emphasis on throughput, latency, and memory efficiency
  • Create and maintain integration tests with real service dependencies in containerized environments
  • Improve test determinism, stability, and reliability across distributed systems
  • Deploy and operate services across development, staging, and production environments using infrastructure-as-code practices
  • Implement safe rollout and rollback procedures using GitOps and CI/CD workflows
  • Develop and evolve observability systems including logs, metrics, and distributed tracing
  • Define service-level objectives (SLOs), configure alerts, and lead incident response and post-incident reviews
  • Design and maintain distributed cluster coordination systems using gossip-based membership and leader-election mechanisms for resilience and scalability
  • Plan and execute performance benchmarking and load testing, including capacity modeling and regression detection
  • Drive performance optimization initiatives across distributed services
  • Apply fuzz testing techniques to critical components to improve reliability and security
  • Practice chaos engineering in lower environments through fault injection, network partitioning, and resource pressure testing to validate resilience and recovery objectives
  • Participate in architecture reviews and code reviews
  • Contribute to technical design documents and RFCs
  • Mentor peers and collaborate cross-functionally on service integrations and stateful components
  • Full-time telecommuting permitted from anywhere in the United States

Minimum Requirements:

  • Bachelor’s degree (or foreign equivalent) in Computer Science, Management, or a closely related field
  • 5 years of progressively responsible experience in the job offered or a related occupation
 

Special Skill Requirements: 

  • 3 years of experience with Rust, including Tokio, asynchronous programming, concurrency, performance optimization, and allocator profiling
  • 2 years of experience with Apache DataFusion and Apache Arrow, including Parquet, data pipelines, query planning, and vectorized execution
  • 3 years of experience creating integration tests with real dependencies using Docker and Testcontainers
  • 2 years of experience with behavior-driven testing for distributed services using frameworks such as Gherkin and Cucumber
  • 2 years of experience with performance benchmarking, including throughput and latency analysis, regression detection, and capacity planning
  • 2 years of experience with load testing using Locust and wrk, including test scenario design, ramp-up strategies, and analysis of latency, throughput, and error rates
  • 1 year of experience with chaos engineering and fault injection, including network partitions, process termination, and resource pressure testing for resilience validation
  • 2 years of experience designing and scaling distributed backend services, including rate limiting, fair queuing, back-pressure control, cluster coordination, gossip-based membership protocols (e.g., SWIM/Chitchat), and leader election
  • 3 years of experience with Kubernetes for production deployments, rollouts, and rollbacks across multiple environments
  • 3 years of experience with Terraform and infrastructure-as-code practices for service provisioning and configuration
  • 3 years of experience with advanced Redis patterns, including counters, streams/pub-sub, distributed locks, and idempotency controls
  • 2 years of experience with PostgreSQL, including SQL optimization, JSON/JSONB, indexing, and locking, as well as columnar OLAP databases such as ClickHouse, including table engines, partitioning, and query tuning
  • 2 years of experience with Ruby for backend and service tooling, including fuzz testing and library development
  • 2 years of experience with Java or Kotlin for backend services
  • 3 years of experience implementing observability and CI/CD systems, including Prometheus, OpenTelemetry, GitHub Actions, and ArgoCD
  • 1 year of experience with chaos engineering and fault injection for distributed systems resilience validation

Salary: $264,514.00-$285,000.00 per annum. 40 hours per week; M-F, 9:00 a.m. to 5:00 p.m.

Must be legally authorized to work in the U.S. without sponsorship.

About the company

Enterprise platform for automating business workflows and integrating applications.

Skills

Rust
Tokio
Apache DataFusion
Apache Arrow
Parquet
Docker
Testcontainers
Gherkin
Cucumber
Locust
wrk
Kubernetes
Terraform
Redis
PostgreSQL
ClickHouse
Ruby
Java
Kotlin
Prometheus
OpenTelemetry
GitHub Actions
ArgoCD