Data Integration Engineer
SipraHub
- Location: Bengaluru, Karnataka, India
- Job type: Full-time
Required skills
- Python
- Agile
- AWS
- compliance
- data ingestion
- data lake
- data modeling
- Java
- Kafka
- Lambda
- MySQL
- Node
- production support
- Root Cause Analysis
- Serverless
- SOAP
About the role
Website: siprahub.com
Job details:
Duties and Responsibilities
- Design, develop, and maintain data ingestion and integration pipelines for both streaming and batch workloads
- Build and support AWS data lake solutions leveraging services such as Kinesis, S3, Athena, Glue, and Lambda
- Implement and maintain medallion architecture patterns (Bronze, Silver, Gold layers) to support analytics, reporting, and downstream consumers
- Develop reusable, scalable Node.js‑based services to orchestrate data movement, validation, and transformation
- Integrate data from on‑prem and cloud‑based systems using a mix of event streams, file‑based ingestion, and APIs where appropriate
- Optimize data pipelines for performance, scalability, cost efficiency, and reliability
- Partner with analytics, platform, and application teams to enable self‑service data access and analytics use cases
- Manage development time effectively and communicate risks, tradeoffs, and technical complexity clearly
- Conduct code reviews and help enforce clean code, testing, and operational best practices
- Document data integration and data lake solutions from concept through implementation and production support
- Stay current with AWS platform updates, data engineering best practices, and emerging technologies
Qualifications
Required
- 7+ years of professional software development experience
- 3+ years of hands‑on experience developing with Python
- Experience with Node.js or Java strongly preferred
- Strong experience building data lakes on AWS, including:
  - Amazon S3 as a primary storage layer
  - Streaming ingestion using Amazon Kinesis (Data Streams and/or Firehose)
  - Query and analytics tooling such as Athena, Presto, Trino, or similar
- Solid understanding and practical use of medallion architecture and modern data modeling approaches
- 3+ years working with messaging or streaming technologies (Kafka, Kinesis, SQS, MQ, Confluent, etc.)
- 4+ years of experience with relational databases (preferably MySQL) and strong SQL skills
- Experience designing and operating systems in cloud and serverless environments
- Proven ability to troubleshoot and resolve production data and pipeline issues, including root cause analysis
- Experience working in Agile / Scrum development environments
- Strong documentation, communication, and collaboration skills
- Self‑driven, accountable, and able to deliver high‑quality work within agreed timelines
Preferred / Nice to Have
- Experience with AWS Glue, data cataloging, and schema evolution
- Familiarity with data quality, observability, and lineage concepts
- Experience with REST‑based APIs for data access or orchestration (SOAP experience not required)
- Exposure to data governance, security, and compliance in cloud environments
- Domain experience in transportation and logistics
- Bachelor’s degree in Computer Science, Engineering, or a related field