Senior Data Engineer

Headsup Corporation

full-time

Required skills

Python
Agile
communication skills
data engineer
data ingestion
data structures
Databricks
DevOps
end-to-end
ETL
GitHub
serialization
SQL
Terraform
user stories

About the role

Website: headsupcorporation.com
Job details:
Job Description: Senior Data Engineer – Data Ingestion & Platforms

Role Overview

We are seeking a seasoned Senior Data Engineer with a strong software engineering mindset to design, build, and optimize our next-generation Ingest Factory and Data Processing Frameworks. In this role, you will go beyond traditional ETL scripting to build scalable, metadata-driven pipelines and reusable data frameworks.

The ideal candidate possesses deep expertise in Python, PySpark, and the Databricks Lakehouse ecosystem (including LakeFlow and Delta Lake), combined with rigorous software engineering discipline (SOLID, CI/CD, and infrastructure as code). You will work both independently and collaboratively within an Agile environment to build production-grade software that ensures data quality, governance, and seamless orchestration.

Key Responsibilities

Architecture & Pipeline Engineering

Ingest Factory Design: Design, develop, and maintain robust data ingestion frameworks leveraging Databricks LakeFlow, managed connectors, and declarative pipelines.

Data Lakehouse Patterns: Implement repeatable ETL/ELT patterns within a Delta Lake architecture, ensuring optimized storage, table design, and strict data lineage enforcement.

Metadata-Driven Orchestration: Build parameterized notebooks and end-to-end orchestration flows to automate ingestion across diverse source system patterns.

Software Craftsmanship & Automation

Advanced Python Development: Write clean, modular, and maintainable Python code applying SOLID and DRY principles. Move beyond basic PySpark scripting to contribute to and publish reusable internal packages (e.g., to PyPI).

Framework Creation: Define reusable functions and framework-level abstractions to dramatically improve development efficiency across the data team.

Testing & Quality Assurance: Implement rigorous data quality checks, monitoring, and alerting frameworks. Lead testing practices, including unit, integration, and end-to-end (E2E) testing.

Optimization & Troubleshooting

Performance Tuning: Optimize complex distributed Spark workloads, Databricks compute configurations, and SQL queries (efficient filtering, indexing, and joins) to reduce processing costs.

Advanced Troubleshooting: Dig into the Spark UI and logs to diagnose and resolve performance bottlenecks, data skew, and serialization issues.

DevOps & Collaboration

CI/CD & IaC: Own the deployment lifecycle by building and maintaining GitHub Actions / GitLab pipelines and provisioning infrastructure with Terraform (IaC).

Agile Delivery: Actively participate in Scrum ceremonies, design solutions to specific user stories, vet architectures with the team, and deliver retro demos prior to production deployment.

Technical Skillset & Qualifications

Must-Have Core Skills:

Cloud Data Experience: 5+ years of production experience in cloud data engineering and building enterprise-grade software.

Databricks Ecosystem: Deep hands-on experience with Databricks Notebooks, Jobs optimization, Delta Lake, Connectors, and LakeFlow (jobs, tasks, flows).

Advanced Python & Spark: Mastery of Core Python, Data Structures & Algorithms (DSA), and package management. Clear understanding of distributed workloads (Spark vs. single-node processing).

Software Engineering Disciplines: 3–5+ years of practicing SOLID coding principles, Git workflows (PR reviews, branching strategies), and modern IDE features (Cursor/VS Code).

DevOps & Automation: Production experience with Terraform and CI/CD tools (GitHub Actions or GitLab CI).

Advanced SQL: Proficiency in writing and optimizing moderately to highly complex queries, ensuring efficient data processing and model design.

Soft Skills & Operational Traits

Ability to work independently as a self-starter while being a highly collaborative team player.

Strong business and data literacy: understanding not just how to move data, but the business purpose behind it.

Excellent communication skills for vetting solutions with peer engineers and presenting work during retro demos.

Skills: data structures & algorithms, PySpark, Python, data ingestion & platforms, data engineer, SQL
