KPG99 INC
Website: kpgtech.com
Job details:
Role: Contract Databricks Data Engineer
Location: Pune, India (Fully Onsite)
Duration: 6 Months CTH
The ideal candidate brings strong hands-on experience with Databricks data engineering capabilities and will work within defined migration frameworks to modernize ETL pipelines, semantic models, and reporting assets into reusable, Spark-based solutions. While exposure to Databricks AI/ML capabilities is a plus, the primary focus of this role is building scalable, reliable data pipelines and analytics workloads aligned with Lakehouse best practices.
Key Responsibilities
• Design, build, and optimize scalable data pipelines using Databricks (Spark, Delta Lake, Unity Catalog).
• Participate in the migration of a ~20TB on-prem Microsoft SQL Server data warehouse to Databricks under defined architectural standards.
• Convert hundreds of SQL Server tables and thousands of SSIS workflows into standardized, reusable Spark-based pipeline patterns.
• Re-engineer legacy SSIS ETL processes into Databricks notebooks, workflows, and orchestration frameworks aligned with Medallion architecture principles.
• Support modernization of SSAS cube-based analytics into Lakehouse-native semantic models and assist in transitioning legacy SSRS reporting to modern governed reporting layers.
• Rebuild and optimize dimensional models within Delta Lake to support scalable analytics workloads.
• Implement data quality, validation, reconciliation, audit, and migration controls throughout the modernization process.
• Optimize Spark performance, partitioning strategies, and cost efficiency across large-scale datasets.
• Apply governance, security, lineage, and access-control standards using Unity Catalog.
• Collaborate with analytics, BI, and AI/ML teams to enable downstream reporting and advanced analytics use cases.
• Contribute to reusable frameworks, documentation, and engineering best practices.
Required Qualifications
• Bachelor’s degree in computer science, engineering, or a related field.
• 5–7 years of data engineering experience.
• Hands-on experience with Databricks (AWS preferred; Azure acceptable).
• Strong proficiency in Spark (PySpark and/or Scala) and SQL.
• Experience participating in migration of on-prem SQL Server data warehouses to cloud-based platforms.
• Experience converting SSIS-based ETL pipelines into Spark-based data engineering solutions.
• Experience working within structured migration or enterprise modernization initiatives (not solely lift-and-shift projects).
• Solid understanding of data warehousing concepts, dimensional modeling, and analytical workloads.
• Experience with Delta Lake, incremental processing patterns, and data versioning.
• Familiarity with Databricks Workflows, Jobs, and production-grade deployments.
• Practical experience with performance tuning and large-volume data processing.
Preferred Skills
• Experience modernizing SSAS cube-based reporting solutions into Lakehouse-native semantic models.
• Exposure to Databricks SQL Warehouses and BI integrations (Power BI preferred).
• Working knowledge of Medallion architecture and Databricks best practices for scalable Lakehouse implementations.
• Familiarity with MLflow, feature engineering, or AI-enablement within Databricks.
• Databricks certification is a plus.