Website:
nextmantra.ai
Job details:
Job Summary
We are looking for a highly skilled Azure Databricks Engineer with 6+ years of experience designing, developing, and optimizing large-scale data engineering solutions on Azure. The ideal candidate has strong expertise in Azure Databricks, PySpark, ETL pipelines, Azure Data Factory, and modern big data technologies.
The candidate will be responsible for building scalable and high-performance data platforms, enabling advanced analytics, reporting, and data-driven decision-making across the organization.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using Azure Databricks and PySpark.
- Build optimized ETL/ELT workflows for processing large-scale structured and unstructured datasets.
- Develop and manage data ingestion frameworks from multiple data sources.
- Work extensively with Azure cloud services including Azure Data Factory, ADLS, Synapse, and Azure SQL.
- Optimize Databricks jobs and Spark performance for large-scale data processing.
- Implement Delta Lake architecture and data lake solutions.
- Collaborate with Data Scientists, Analysts, and Business teams to support analytics and reporting requirements.
- Ensure data pipelines meet data quality, governance, security, and compliance standards.
- Automate deployment and monitoring processes for data pipelines.
- Troubleshoot production issues and optimize existing workflows.
- Participate in architecture discussions and technical design reviews.
Mandatory Skills
Azure & Databricks
- 6+ years of experience in Data Engineering.
- Strong hands-on experience with Azure Databricks.
- Expertise in PySpark and Apache Spark.
- Strong experience with Delta Lake and Lakehouse architecture.
- Experience with Databricks Workflows and Job scheduling.
Azure Services
- Hands-on experience with:
  - Azure Data Factory (ADF)
  - Azure Data Lake Storage (ADLS)
  - Azure Synapse Analytics
  - Azure SQL Database
  - Azure Key Vault
  - Azure Functions
- Good understanding of cloud-native architecture on Azure.
Data Engineering
- Strong understanding of ETL/ELT concepts.
- Experience handling large-scale batch and streaming data pipelines.
- Knowledge of data modeling and data warehousing concepts.
- Experience with structured and semi-structured data processing.
- Strong SQL programming skills.
Programming & Tools
- Strong proficiency in:
  - Python
  - PySpark
  - SQL
  - Scala (preferred)
- Experience with Git/version control systems.
- Familiarity with CI/CD pipelines and DevOps practices.
Performance Optimization
- Experience in Spark optimization techniques:
  - Partitioning
  - Caching
  - Broadcast joins
  - Query optimization
  - Cluster tuning
- Ability to optimize cost and performance in Azure environments.
Preferred Skills
- Experience with streaming on Kafka or Azure Event Hubs.
- Exposure to Terraform or Infrastructure as Code (IaC).
- Experience with Power BI integration.
- Knowledge of Data Governance and Data Security best practices.
- Familiarity with Agile/Scrum methodologies.
- Exposure to Machine Learning pipelines.
Education
Bachelor’s or Master’s degree in Computer Science, Information Technology, Engineering, or a related field.