About the role
We are seeking a highly skilled Azure Databricks Engineering Lead to design, develop, and optimize data pipelines using Azure Databricks. The ideal candidate will have deep expertise in data engineering, cloud-based data processing, and ETL workflows to support business intelligence and analytics initiatives.
Primary Responsibilities:
Design, develop, and implement scalable data pipelines using Azure Databricks
Develop PySpark-based data transformations and integrate structured and unstructured data from various sources
Optimize Databricks clusters for performance, scalability, and cost-efficiency within the Azure ecosystem
Monitor, troubleshoot, and resolve performance bottlenecks in Databricks workloads
Manage orchestration and scheduling of end-to-end data pipelines using tools such as Apache Airflow, Azure Data Factory (ADF) scheduling, and Logic Apps
Collaborate effectively with the architecture team on solution design and with product owners on validating implementations
Implement best practices for data quality, monitoring, logging, alerting on failure scenarios, and exception handling
Document step-by-step processes for troubleshooting potential issues, and deliver cost-optimized cloud solutions
Provide technical leadership, mentorship, and best practices for junior data engineers
Stay up to date with Azure and Databricks advancements to continuously improve data engineering capabilities
Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies in regards to flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so
Required Qualifications:
Overall 7+ years of experience in the IT industry and 6+ years of experience in data engineering, with at least 3 years of hands-on experience in Azure Databricks
Experience with CI/CD pipelines for data engineering solutions (Azure DevOps, Git)
Hands-on experience with Delta Lake, Lakehouse architecture, and data versioning
Solid expertise in the Azure ecosystem, including Azure Synapse, Azure SQL, ADLS, and Azure Functions
Proficiency in PySpark, Python, and SQL for data processing in Databricks
Deep understanding of data warehousing, data modeling (Kimball/Inmon), and big data processing
Solid knowledge of performance tuning, partitioning, caching, and cost optimization in Databricks
Proven excellent written and verbal communication skills
Proven excellent problem-solving skills and ability to work independently
Proven ability to balance multiple competing priorities and execute accordingly
Proven highly self-motivated with excellent interpersonal and collaborative skills
Ability to anticipate risks and obstacles and develop plans for mitigation
Proven excellent documentation experience and skills
Preferred Qualifications:
Azure certifications (e.g., DP-203, AZ-304)
Experience with infrastructure as code, scheduling as code, and automating operational activities using Terraform
About the company
Optum is a global organization that delivers care, aided by technology to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by diversity and inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health equity on a global scale. Join us to start Caring. Connecting. Growing together