Gloify
Website:
gloify.com
Job details:
Job Summary
We are seeking an experienced Python / PySpark Developer with strong expertise in big data technologies and data processing. The ideal candidate will have hands-on experience in building scalable data pipelines using PySpark, Python, and SQL, along with exposure to Java and Pandas for data manipulation and processing.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using Python and PySpark
- Process and analyze large datasets using Pandas and Spark DataFrames
- Write optimized SQL queries for data extraction and transformation
- Work with Java-based components where required in the data ecosystem
- Perform data cleansing, transformation, and validation
- Optimize data workflows for performance and scalability
- Collaborate with cross-functional teams, including data engineers and business stakeholders
- Ensure data quality, integrity, and consistency
Required Skills
- Strong experience in Python, PySpark, Pandas, and SQL
- 5+ years of experience with PySpark
- Good knowledge of Java (for integration or backend support)
- Hands-on experience with Apache Spark (RDD, DataFrames, Spark SQL)
- Strong understanding of ETL processes and data pipelines
- Experience with big data tools (Hadoop, Hive, etc.)
- Strong problem-solving and analytical skills
Preferred Skills
- Experience with Airflow or other orchestration tools
- Exposure to cloud platforms (AWS / Azure / GCP)
- Knowledge of data warehousing and data lakes
- Familiarity with CI/CD and version control (Git)