Willware Technologies
Website: willwaretech.com
Job details:
Job Summary:
We are looking for a skilled AWS Data Engineer with strong expertise in building scalable data pipelines using AWS services such as Glue, Redshift, and Lambda. The ideal candidate should have deep knowledge of SQL, PySpark, and ETL processes, with a focus on performance optimization and large-scale data processing.
Key Responsibilities:
- Design, develop, and maintain scalable ETL pipelines using AWS services
- Build and optimize data workflows using AWS Glue, Lambda, and PySpark
- Develop and manage data warehousing solutions in Amazon Redshift
- Write complex and optimized SQL queries for data extraction, transformation, and analysis
- Implement data processing frameworks using Apache Spark (PySpark)
- Optimize data pipelines for performance, scalability, and cost efficiency
- Implement and maintain data models and data warehouse schemas
- Work with large datasets and ensure data quality, consistency, and integrity
- Implement incremental loads, CDC (Change Data Capture), and SCD (Slowly Changing Dimensions)
- Monitor, troubleshoot, and enhance existing ETL jobs and workflows
- Collaborate with data analysts, data scientists, and business stakeholders
Required Skills:
- Strong hands-on experience with AWS Glue, Amazon Redshift, and AWS Lambda
- Expertise in SQL: complex joins, aggregations, window functions, and query performance tuning
- Strong experience in PySpark / Apache Spark, including an understanding of Spark architecture and performance optimization (partitioning, caching, joins, etc.)
- Solid understanding of ETL concepts: SCD (Type 1, Type 2), delta loads, and Change Data Capture (CDC)
- Experience in data warehousing concepts (Star/Snowflake schema)
- Strong problem-solving and analytical skills