Job Description: Data Engineer (Retail & CPG)

Tech Mahindra

full-time

Required skills

Python
Airflow
AWS
automated testing
Azure
BigQuery
clustering
data engineer
data models
Databricks
DevOps
end-to-end
ETL
Git
Snowflake
Spark
SQL
Terraform
version control

About the role

Website: techmahindra.com
Job details:

· Experience: 5–7 Years (Heavy Hands-on)

· Location: India (Hybrid) – Bangalore/Hyderabad/Others

· Domain: Retail & Consumer Packaged Goods (CPG)

Role Overview

We are looking for a powerhouse Senior Data Engineer with 5 to 7 years of deep, hands-on experience to design, build, and optimize our core data platforms. You will be building robust data pipelines that process large-scale retail data (POS, inventory, supply chain, customer loyalty) to drive real-time business decisions.

This is a 100% hands-on role. If you love writing clean code, optimizing massive Spark jobs, and automating everything via DataOps, you will thrive here.

Technical Stack & Core Responsibilities

1. Data Engineering & Processing (Databricks, PySpark, Python)

· Pipeline Development: Design and implement highly scalable, end-to-end data pipelines (ETL/ELT) using Python and PySpark on Azure Databricks.

· Optimization: Fine-tune Spark configurations, tune partitioning, and optimize complex joins to minimize cluster costs and execution latency.

· Storage Architecture: Leverage Delta Lake features (Z-ordering, liquid clustering, time travel) to build reliable, high-performance Lakehouses.

2. Modern Data Warehousing & SQL

· Cloud Databases: Design schemas and optimize data layers across major cloud data warehouses (e.g., Snowflake, Google BigQuery, or Amazon Redshift).

· Advanced SQL: Write and optimize complex analytical queries, stored procedures, and window functions for calculating retail metrics.

3. DataOps & Automation

· CI/CD & Infrastructure: Build and maintain automated deployment pipelines for data code and infrastructure using tools like Git, Azure DevOps/GitHub Actions, and Terraform.

· Orchestration: Schedule and monitor complex, multi-stage workflows using Airflow, Databricks Workflows, or Azure Data Factory (ADF).

· Testing: Implement automated data quality checks, unit tests (e.g., pytest), and validation frameworks to ensure data integrity.

4. Retail & CPG Domain Expertise

· Design data models tailored for retail workloads, including Point-of-Sale (POS) data integration, Inventory Management, Supply Chain Logistics, and Customer 360/Loyalty analytics.

· Handle the nuances of retail data, such as high-velocity transactional data, fluctuating seasonal volumes, and master data management (Product, Store, Customer hierarchies).

Required Profile

· Experience: 5–7 years of solid data engineering experience, with at least 3 years explicitly focused on Databricks and PySpark.

· Coding: Expert-level Python/PySpark and advanced SQL skills are non-negotiable.

· Domain: Proven track record working with large-scale Retail, CPG, or related data systems.

· Mindset: A strong advocate for applying software engineering best practices to data work (DataOps, version control, automated testing).

· Education: Bachelor’s/Master’s degree in Computer Science, Information Technology, or a related field.
