Healthark Insights
Website: healtharkinsights.com
Job details:
Company Detail
Founded in 2015, Healthark is India’s leading pure-play healthcare and life sciences consulting firm, offering global management consulting services. We are a team that blends deep domain expertise, scientific rigor, and a diverse skill set to address complex challenges for our clients. Our comprehensive range of services includes Growth Advisory, GCC Advisory, Strategy Formulation, Real-World Evidence (RWE), and Information, Data & Technology. Backed by a team of 150+ professionals, we have successfully delivered over 1,000 projects across 60+ markets. Our growing portfolio includes 60+ clients, ranging from nimble startups to industry-leading global giants.
Operating out of key offices in Ahmedabad and Hyderabad, Healthark continues to drive global impact through deep domain expertise and strategic insights.
Position: Data Engineer
Experience: 4-8 Years
Location: Pune, Bangalore, Chennai
Working Days: Mon to Fri
Company URL: https://healthark.ai/
Role Overview
We are seeking a Data Engineer with 3–5 years of experience in building scalable data pipelines and modern data lake architectures using Databricks, PySpark, Airflow, and AWS services. The role involves developing batch and streaming pipelines, optimizing data processing workflows, and supporting analytics and reporting platforms.
Key Responsibilities
- Design, develop, and maintain data pipelines using PySpark and Databricks.
- Build and orchestrate workflows using Apache Airflow.
- Develop and manage data lake solutions using Apache Iceberg.
- Implement ingestion pipelines from multiple sources into AWS S3-based data lakes.
- Work with AWS services such as S3, AWS Glue/Crawlers, Lambda, Lake Formation, SNS/SQS, and EventBridge.
- Optimize Spark jobs and data processing performance.
- Implement data quality checks and monitoring mechanisms.
- Support CI/CD processes and infrastructure automation.
Required Skills
- Strong experience in Python and PySpark.
- Hands-on experience with Databricks.
- Experience building pipelines using Apache Airflow.
- Strong knowledge of the AWS data lake ecosystem.
- Experience with Apache Iceberg or other modern table formats.
- Good understanding of data modeling and ETL concepts.
- Experience working with large datasets and distributed systems.
Preferred Skills
- Exposure to Kafka-based streaming pipelines.
- Experience with Terraform or Infrastructure as Code (IaC).
- Understanding of data governance using Lake Formation.
Education
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.