Website:
eartem.com
Job details:
Profile:
This role builds reliable, scalable, and governed data pipelines that power analytics, operational reporting, and the Data Intelligence Layer on Databricks.
Job Responsibilities:
• Build batch pipelines using Delta Lake with correct table design, partitioning strategy, OPTIMIZE, Z-ORDER, and VACUUM management
• Implement incremental ingestion using Databricks Auto Loader with schema evolution controls and checkpointing
• Build Structured Streaming pipelines with watermarking, state handling, late-arriving data logic, and restart safety
• Implement Lakeflow pipelines for declarative, governed data processing
• Design pipelines to support idempotency, replayability, and safe backfills
• Apply Databricks Runtime and Spark optimisations including adaptive query execution, skew mitigation, shuffle tuning, and join optimisation
• Build curated and analytics-ready datasets consumed by Databricks SQL, Genie, dashboards, and downstream applications
• Develop DBSQL views and models aligned with the semantic consistency required by the Data Intelligence Layer
• Package pipelines using Databricks Repos and Asset Bundles for CI/CD-driven deployments
• Instrument pipelines with operational metrics and observability hooks
Secondary Responsibilities:
• Implement pipelines in compliance with Unity Catalog permissions and data contracts
• Embed data quality checks defined by governance standards
• Support Synapse-to-Databricks migrations including logic rewrite, validation, and performance benchmarking
• Collaborate with DataOps to ensure deployment readiness and operational stability