- Location: Bengaluru, Karnataka, India
- Job type: Full-time
Required skills
- Python
- Apache Airflow
- Apache Spark
- AWS Solutions Architect
- CI/CD
- CloudFormation
- data architecture
- data engineering
- data modeling
- Databricks
- Docker
- ETL
- Jenkins
- Kafka
- KPI
- Kubernetes
- AWS Lambda
- Snowflake
- SQL
- Terraform
- Unity Catalog
About the role
Mogi I/O: OTT/Podcast/Short Video Apps for you
Website: mogiio.com
Job details:
Location: Bengaluru, India (Bagmane Tech Park – Nike ITC Office)
Work Mode: Hybrid (3 Days WFO | 2 Days WFH)
Employment Type: Full-Time
Experience: 6 – 8 Years
Notice Period: Immediate to 30 Days
Compensation: INR 2,400,000 – 2,800,000
Work Timings
Day shift with extended overlap with the US team.
Expected working window: 10:30/11:00 AM – 10:00/11:00 PM IST, with adequate breaks.
Role Overview
As a Lead Data Engineer, you will act as both a hands-on technical leader and a strategic data architect, owning next-generation unified analytics foundations across Digital, Stores, and Marketplace domains.
This role is responsible for defining the target-state data architecture, executing the complete Snowflake divestiture, and delivering a scalable, governed Databricks Lakehouse ecosystem with ≥95% enterprise KPI alignment.
Key Responsibilities
- Define target-state enterprise data architecture using Databricks, Apache Spark, and AWS-native services.
- Own and deliver the Snowflake divestiture strategy, ensuring zero residual dependency and uninterrupted reporting.
- Design scalable, secure, and cost-optimized batch and streaming data pipelines.
- Establish architectural standards for data modeling, storage formats, and performance optimization.
- Design and build ETL/ELT pipelines using Python, Spark, and SQL for large-scale analytics.
- Develop production-grade pipelines leveraging AWS S3, Lambda, EMR, and Databricks.
- Enable real-time and near-real-time data processing using Kafka, Kinesis, and Spark Streaming.
- Drive containerized deployments using Docker and Kubernetes.
- Lead orchestration standards using Apache Airflow for complex workflows.
- Implement CI/CD pipelines with Git and Jenkins, enforcing automation, security, and quality best practices.
- Own infrastructure provisioning using Terraform and/or CloudFormation.
- Establish enterprise-wide data lineage, cataloging, and access controls using Unity Catalog.
- Define and govern metric dictionaries and KPI frameworks to ensure semantic consistency.
- Partner with analytics, product, and business teams to achieve ≥95% KPI alignment.
- Implement monitoring, alerting, and observability across pipelines and platforms.
- Define SLAs, SLOs, and operational playbooks for mission-critical analytics.
- Mentor senior and mid-level engineers, raising overall engineering standards.
Must-Have Qualifications
- 6–8 years of experience in data engineering, distributed systems, and platform architecture with clear ownership.
- Strong hands-on experience with Databricks and Apache Spark in large-scale production environments.
- Deep AWS expertise (S3, Lambda, EMR).
- Advanced Python for data processing, automation, and optimization.
- Advanced SQL for complex queries, data modeling, and performance tuning.
- Proven experience modernizing legacy platforms and migrating to Databricks/Spark Lakehouse architectures.
- Strong exposure to data governance, lineage, cataloging, and enterprise metrics.
Certifications (Preferred / Mandatory)
- Databricks Certified Data Engineer – Professional (Mandatory / Strongly Preferred)
- AWS Solutions Architect – Associate or Professional (Preferred)
Note: Certification is preferred; however, exceptionally strong candidates without certification will also be considered.