Lead Data Platform Architect

Mogi I/O : OTT/Podcast/Short Video Apps for you

Location: Bengaluru, Karnataka, India
Job type: Full-time

Required skills

Python
AWS Solutions Architect
Apache Airflow
Apache Spark
CI/CD
CloudFormation
data architecture
data engineer
data modeling
Databricks
Docker
ETL
Jenkins
Kafka
KPI
Kubernetes
Lambda
Snowflake
SQL
Terraform
Unity Catalog

About the role

Website: mogiio.com

Job details:
Location: Bengaluru, India (Bagmane Tech Park – Nike ITC Office)

Work Mode: Hybrid (3 Days WFO | 2 Days WFH)

Employment Type: Full-Time

Experience: 6 – 8 Years

Notice Period: Immediate to 30 Days

Compensation: INR 24,00,000 – 28,00,000

Work Timings

Day shift with extended overlap with the US team.

Expected working window: 10:30/11:00 AM – 10:00/11:00 PM IST, with adequate breaks.

Role Overview

As a Lead Data Engineer, you will act as both a hands-on technical leader and a strategic data architect, owning next-generation unified analytics foundations across Digital, Stores, and Marketplace domains.

This role is responsible for defining the target-state data architecture, executing the complete Snowflake divestiture, and delivering a scalable, governed Databricks Lakehouse ecosystem with ≥95% enterprise KPI alignment.

Key Responsibilities

Define target-state enterprise data architecture using Databricks, Apache Spark, and AWS-native services.
Own and deliver the Snowflake divestiture strategy, ensuring zero residual dependency and uninterrupted reporting.
Design scalable, secure, and cost-optimized batch and streaming data pipelines.
Establish architectural standards for data modeling, storage formats, and performance optimization.
Design and build ETL/ELT pipelines using Python, Spark, and SQL for large-scale analytics.
Develop production-grade pipelines leveraging AWS S3, Lambda, EMR, and Databricks.
Enable real-time and near-real-time data processing using Kafka, Kinesis, and Spark Streaming.
Drive containerized deployments using Docker and Kubernetes.
Lead orchestration standards using Apache Airflow for complex workflows.
Implement CI/CD pipelines with Git and Jenkins, enforcing automation, security, and quality best practices.
Own infrastructure provisioning using Terraform and/or CloudFormation.
Establish enterprise-wide data lineage, cataloging, and access controls using Unity Catalog.
Define and govern metric dictionaries and KPI frameworks to ensure semantic consistency.
Partner with analytics, product, and business teams to achieve ≥95% KPI alignment.
Implement monitoring, alerting, and observability across pipelines and platforms.
Define SLAs, SLOs, and operational playbooks for mission-critical analytics.
Mentor senior and mid-level engineers, raising overall engineering standards.

Must-Have Qualifications

6–8 years of experience in data engineering, distributed systems, and platform architecture, with clear ownership.
Strong hands-on experience with Databricks and Apache Spark in large-scale production environments.
Deep AWS expertise (S3, Lambda, EMR).
Advanced Python for data processing, automation, and optimization.
Advanced SQL for complex queries, data modeling, and performance tuning.
Proven experience modernizing legacy platforms and migrating to Databricks/Spark Lakehouse architectures.
Strong exposure to data governance, lineage, cataloging, and enterprise metrics.

Certifications (Preferred / Mandatory)

Databricks Certified Data Engineer – Professional (Mandatory / Strongly Preferred)
AWS Solutions Architect – Associate or Professional (Preferred)

Note: Certification is preferred; however, exceptionally strong candidates without certification will also be considered.
