LTIMindtree
Website: ltimindtree.com
Job details:
Work Mode: Work from Office
Location: Pune & Kolkata
Experience: 3 to 12 years
As part of a strategic data modernization initiative, this role focuses on transforming legacy Informatica ETL workflows into modern, scalable, cloud-native PySpark pipelines. The developer will re-engineer complex transformation logic into distributed PySpark code and deploy robust data pipelines on AWS using containerization and orchestration frameworks.
Position: PySpark Developer
Key Responsibilities
- Design and develop modular PySpark ETL pipelines for ingestion, transformation, cleansing, and validation
- Analyze and convert Informatica XML mappings into efficient, production-ready PySpark transformations
- Develop Python scripts and jobs to support data ingestion and additional processing activities
- Integrate PySpark workflows with orchestration tools such as Apache Airflow or AutoSys
- Containerize PySpark applications using Docker and deploy them on AWS EKS
- Collaborate with data architects to ensure alignment on data models, schema design, and modernization objectives
- Perform unit, integration, and regression testing to ensure functional parity with legacy Informatica workflows
Required Skills
- Must have: Strong hands-on expertise in PySpark, Python, and distributed data processing
- Must have: Deep understanding of data engineering concepts (filters, transformations, lookups, broadcast joins, etc.)
- Must have: Advanced SQL skills for complex joins, filtering, aggregations, and validation
- Experience with AWS services, including S3, EKS, RDS, and IAM
- Must have: Proficiency in Docker for containerization and deployment workflows
- Good to have: Exposure to NoSQL databases
- Good to have: Familiarity with Informatica PowerCenter components, mappings, and workflow patterns
- Good to have: Experience in ETL modernization or migration initiatives
Additional Expectations
- Should be confident in taking ownership of converting Informatica components into PySpark equivalents
- Must be willing and proactive in learning Informatica components through documentation, even if unfamiliar with them initially
- Should be comfortable starting with initial guidance and gradually transitioning to full end-to-end ownership of assigned pipelines and components