- Collaborate with business users and analysts to gather data requirements and deliver scalable data solutions
- Design, develop, and optimize ETL pipelines using AWS services such as Glue, Lambda, EMR, Kinesis, and Step Functions
- Build and manage data lakes and data warehouses using Amazon S3 and Redshift
- Develop and maintain SQL queries, stored procedures, and data transformation logic
- Enable self-service analytics using Athena, Redshift Spectrum, and the Glue Data Catalog
- Implement data ingestion pipelines from multiple sources (RDBMS, APIs, flat files)
- Design scalable data models using Star and Snowflake schemas
- Perform data profiling and root cause analysis, and resolve data quality issues
- Maintain documentation, including data dictionaries, data flow diagrams, and mappings
- Support production pipelines and troubleshoot failures or outages
- Monitor and optimize pipeline performance, cost, and reliability across AWS services
- 10+ years of experience in data engineering, ETL, and data warehousing
- Strong hands-on experience with AWS data services (Glue, Lambda, EMR, Kinesis, S3, Redshift)
- Proficiency in SQL with experience in relational databases (SQL Server, Oracle, etc.)
- Strong understanding of data modeling techniques (Star and Snowflake schemas)
- Experience in building and managing data pipelines from diverse sources
- Strong problem-solving and analytical skills
- Ability to debug data issues and optimize performance
- Experience working directly with business stakeholders
- Ability to handle multiple projects in a fast-paced environment
- Excellent communication and documentation skills