Data Engineer (PySpark and Apache Airflow)

Location: Pune Division, Maharashtra, India

Job Type: Full-time

About the job

About the role

Aligned Automation
Website: alignedautomation.com

About Aligned Automation

At Aligned Automation, we live by our "Better Together" philosophy to build a better world. As a strategic service provider to Fortune 500 companies, we help digitize enterprise operations and drive impactful business strategies. Our purpose goes beyond projects—we strive to deliver meaningful, sustainable change that shapes a more optimistic and equitable future.

Our culture is deeply rooted in our 4Cs—Care, Courage, Curiosity, and Collaboration—ensuring that each employee is empowered to grow, innovate, and thrive in an inclusive workplace.

Job Summary

We are seeking a skilled Data Engineer with strong expertise in PySpark and Apache Airflow to design, build, and optimize scalable data pipelines. The ideal candidate should have experience in big data processing, workflow orchestration, and cloud-based data platforms.

Key Responsibilities

  • Design, develop, and maintain scalable ETL/ELT pipelines using PySpark
  • Build and manage workflow orchestration using Apache Airflow
  • Process large datasets using distributed computing frameworks (Spark)
  • Optimize data pipelines for performance, reliability, and scalability
  • Implement data quality checks and monitoring mechanisms
  • Work closely with Data Analysts, Data Scientists, and BI teams
  • Manage data ingestion from various sources (APIs, databases, flat files, streaming)
  • Troubleshoot and resolve pipeline failures
  • Implement CI/CD for data pipelines
  • Ensure data governance and security best practices
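To illustrate the data-quality responsibility above, here is a minimal, framework-agnostic sketch of a null-rate check (the column name, sample records, and threshold are hypothetical; in a real pipeline the same gate would typically run against a Spark DataFrame before downstream tasks proceed):

```python
def null_rate(rows, column):
    """Fraction of records where `column` is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(column) is None)
    return missing / len(rows)

def check_quality(rows, max_null_rate=0.05):
    """Simple data-quality gate: pass if the null rate stays under a threshold."""
    rate = null_rate(rows, "customer_id")  # "customer_id" is an illustrative column
    return rate <= max_null_rate, rate

# Hypothetical sample records
records = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": None, "amount": 5.0},
    {"customer_id": 3, "amount": 7.5},
]
passed, rate = check_quality(records, max_null_rate=0.5)
```

A failing check would normally raise or mark the pipeline run as failed so that bad data never reaches consumers.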

Required Skills

Technical Skills:

  • Strong hands-on experience in PySpark
  • Experience in Apache Airflow (DAGs, Operators, Scheduling)
  • Good understanding of Spark architecture
  • Strong SQL knowledge
  • Experience with data warehousing concepts
  • Experience with cloud object storage (S3 / ADLS / GCS)
  • Experience with cloud data warehouses (Redshift / Snowflake / BigQuery)
  • Knowledge of Git and version control
  • Understanding of REST APIs and data ingestion
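The Airflow items above (DAGs, Operators, scheduling) amount to authoring files like the following minimal sketch. The DAG id, schedule, and task bodies are hypothetical, and in practice the callables would submit PySpark jobs rather than run placeholder logic; the `schedule` argument assumes Airflow 2.4+ (older versions use `schedule_interval`):

```python
# Hypothetical DAG file: dag id, schedule, and task logic are illustrative only.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # e.g. pull data from an API or database into staging storage
    pass

def transform():
    # e.g. submit a PySpark job that cleans and aggregates the staged data
    pass

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    extract_task >> transform_task  # extract runs before transform
```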

Good to Have:

  • Experience with cloud platforms (AWS / Azure / GCP)
  • Experience with Kafka or streaming pipelines
  • Docker & Kubernetes knowledge
  • Delta Lake / Iceberg knowledge
  • Experience in CI/CD tools (Jenkins, GitHub Actions)
  • Experience in monitoring tools (Prometheus, Grafana)

Educational Qualification

  • Bachelor’s or Master’s degree in Computer Science, IT, Engineering, or related field

Soft Skills

  • Strong problem-solving skills
  • Good communication and collaboration skills
  • Ability to work in an agile environment
  • Ownership mindset and attention to detail




Skills

Agile
Apache Airflow
AWS
Azure
BigQuery
data ingestion
Docker
ETL
GCP
GCS
GitHub
Jenkins
Kafka
Kubernetes
PySpark
REST APIs
Snowflake
SQL
version control