Tech Mahindra
Website: techmahindra.com
Job details:
Job Description: Data Engineer (Retail & CPG)
· Experience: 5–7 Years (Heavy Hands-on)
· Location: India (Hybrid) – Bangalore/Hyderabad/Others
· Domain: Retail & Consumer Packaged Goods (CPG)
Role Overview
We are looking for a powerhouse Senior Data Engineer with 5 to 7 years of deep, hands-on experience to design, build, and optimize our core data platforms. You will be building robust data pipelines that process large-scale retail data (POS, inventory, supply chain, customer loyalty) to drive real-time business decisions.
This is a 100% hands-on role. If you love writing clean code, optimizing massive Spark jobs, and automating everything via DataOps, you will thrive here.
Technical Stack & Core Responsibilities
1. Data Engineering & Processing (Databricks, PySpark, Python)
· Pipeline Development: Design and implement highly scalable, end-to-end data pipelines (ETL/ELT) using Python and PySpark on Azure Databricks.
· Optimization: Fine-tune Spark configurations, tune partitioning, and optimize complex joins to minimize cluster costs and execution latency.
· Storage Architecture: Leverage Delta Lake features (Z-ordering, liquid clustering, time travel) to build reliable, high-performance Lakehouses.
2. Modern Data Warehousing & SQL
· Cloud Databases: Design schemas and optimize data layers across major cloud data warehouses (e.g., Snowflake, Google BigQuery, or Amazon Redshift).
· Advanced SQL: Write and optimize highly complex analytical queries, stored procedures, and window functions for retail metrics calculation.
3. DataOps & Automation
· CI/CD & Infrastructure: Build and maintain automated deployment pipelines for data code and infrastructure using tools like Git, Azure DevOps/GitHub Actions, and Terraform.
· Orchestration: Schedule and monitor complex, multi-stage workflows using Apache Airflow, Databricks Workflows, or Azure Data Factory (ADF).
· Testing: Implement automated data quality checks, unit tests (e.g., pytest), and validation frameworks to ensure data integrity.
4. Retail & CPG Domain Expertise
· Design data models tailored for retail workloads, including Point-of-Sale (POS) data integration, Inventory Management, Supply Chain Logistics, and Customer 360/Loyalty analytics.
· Handle the nuances of retail data, such as high-velocity transactional data, fluctuating seasonal volumes, and master data management (Product, Store, Customer hierarchies).
Required Profile
· Experience: 5–7 years of solid data engineering experience, with at least 3 years explicitly focused on Databricks and PySpark.
· Coding: Expert-level Python/PySpark and advanced SQL skills are non-negotiable.
· Domain: Proven track record working with large-scale Retail, CPG, or related data systems.
· Mindset: A strong advocate for software engineering best practices applied to data (DataOps, version control, automated testing).
· Education: Bachelor’s/Master’s degree in Computer Science, Information Technology, or a related field.