Senior Spark Developer

Location: San Jose, CA

Job Type: Full-time

About the role

Introduction

A career in IBM Software means you’ll be part of a team that transforms our customers’ challenges into solutions.

Seeking new possibilities and always staying curious, we are a team dedicated to creating the world’s leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their career.

We are seeking a skilled Spark developer to join our IBM Software team. As part of our team, you will be responsible for developing and maintaining high-quality software products, working with a variety of technologies and programming languages.

IBM’s product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.

Your Role And Responsibilities

  • Design, develop, and optimize big data applications using Apache Spark and Scala.
  • Architect and implement scalable data pipelines for both batch and real-time processing.
  • Collaborate with data engineers, analysts, and architects to define data strategies.
  • Optimize Spark jobs for performance and cost-effectiveness on distributed clusters.
  • Build and maintain reusable code and libraries for future use.
  • Work with various data storage systems like HDFS, Hive, HBase, Cassandra, Kafka, and Parquet.
  • Implement data quality checks, logging, monitoring, and alerting for ETL jobs.
  • Mentor junior developers and lead code reviews to ensure best practices.
  • Ensure security, governance, and compliance standards are adhered to in all data processes.
  • Troubleshoot and resolve performance issues and bugs in big data solutions.

Preferred Education

Bachelor's Degree

Required Technical And Professional Expertise

  • 12+ years of total software development experience.
  • 5+ years of hands-on experience with Apache Spark and Scala.
  • Proficiency in Scala with deep knowledge of functional programming.
  • Strong experience with distributed computing, parallel data processing, and cluster computing frameworks.
  • Strong problem-solving skills and the ability to work independently or as part of a team.
  • Experience with cloud platforms such as AWS, Azure, or GCP (especially EMR, Databricks, or HDInsight).
  • Solid understanding of Spark tuning, partitions, joins, broadcast variables, and performance optimization techniques.
  • Hands-on experience with Kafka, Hive, HBase, NoSQL databases, and data lake architectures.
  • Familiarity with CI/CD pipelines, Git, Jenkins, and automated testing.

Preferred Technical And Professional Experience

  • Experience with Databricks, Delta Lake, or Apache Iceberg.
  • Exposure to machine learning pipelines using Spark MLlib or integration with ML frameworks.
  • Contributions to open-source big data projects are a plus.
  • Excellent communication and leadership skills.
  • Understanding of data lake and lakehouse architectures.
  • Knowledge of Python, Java, or other backend languages is a plus.

Skills

Python
AWS
Apache
Apache Spark
automated testing
Azure
backend
Cassandra
compliance
data lake
data solutions
Databricks
ETL
functional programming
GCP
Git
HBase
Hive
Java
Jenkins
Kafka
machine learning
NoSQL
Parquet
Spark