Python PySpark Engineer

Gloify

Location: India
Job type: Full-time

Required skills

Python
Airflow
AWS
Apache
Apache Spark
Azure
backend
big data technologies
CI/CD
cross-functional
ETL
GCP
Git
Hadoop
Hive
Java
Pandas
SQL
version control

About the role

Website: gloify.com

Job Summary

We are seeking an experienced Python / PySpark Developer with strong expertise in big data technologies and data processing. The ideal candidate will have hands-on experience in building scalable data pipelines using PySpark, Python, and SQL, along with exposure to Java and Pandas for data manipulation and processing.

Key Responsibilities

Design, develop, and maintain scalable data pipelines using Python and PySpark
Process and analyze large datasets using Pandas and Spark DataFrames
Write optimized queries using SQL for data extraction and transformation
Work with Java-based components where required in the data ecosystem
Perform data cleansing, transformation, and validation
Optimize data workflows for performance and scalability
Collaborate with cross-functional teams, including data engineers and stakeholders
Ensure data quality, integrity, and consistency

Required Skills

Strong experience in Python, PySpark, Pandas, and SQL
5+ years of experience with PySpark
Good knowledge of Java (for integration or backend support)
Hands-on experience with Apache Spark (RDD, DataFrames, Spark SQL)
Strong understanding of ETL processes and data pipelines
Experience with big data tools (Hadoop, Hive, etc.)
Strong problem-solving and analytical skills

Preferred Skills

Experience with Airflow or other orchestration tools
Exposure to cloud platforms (AWS / Azure / GCP)
Knowledge of data warehousing and data lakes
Familiarity with CI/CD and version control (Git)

