Senior Data Engineer (ETL & AI Architecture)

Location

Mumbai, Maharashtra, India

Job Type

Full-time

About the job

About the role

NexGen Tech Solutions

Website: nexgentechsolutions.com
Job details:

Job Description: Senior Data Engineer (ETL & AI Architecture)

Experience: 6–8 Years

Location: Mumbai (Full-time from office)

Employment Type: Full Time

Reporting To: Lead – Data Analytics & AI


Role Purpose

We are seeking a highly skilled Data Engineer who goes beyond pipeline execution to architect and deliver robust, end-to-end data solutions. The role involves designing and implementing efficient Silver and Gold data layers, optimizing compute costs through deep parameter tuning, enforcing data quality and governance, and building a semantic layer that enables meaningful, consistent querying of enterprise data.

We value strong foundational data and engineering principles over tool-specific expertise. Candidates from Azure, AWS, or Google Cloud backgrounds are welcome, provided they possess a deep understanding of distributed computing and can optimize systems for performance, cost, reliability, and accuracy.

Key Responsibilities

1. Architecture & Data Modelling

  • Design & Strategy: Collaborate with stakeholders to design, document, and implement data structures across Bronze, Silver, and Gold layers to ensure scalability and faster insights.
  • Data Modelling: Develop extensible data models that decouple storage from compute for flexibility.
  • AI Readiness: Build semantic layers (metadata, relationships, context, feature stores) to support Large Language Models (LLMs) and AI use cases.
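For illustration only, the Bronze → Silver → Gold flow described above can be sketched in plain Python; the field names and cleaning rules here are hypothetical, not part of the role:

```python
# Hypothetical medallion-architecture sketch: raw events (Bronze) are
# cleaned into typed records (Silver), then aggregated into business
# metrics (Gold). Real implementations would use Spark/Delta tables.
from collections import defaultdict

bronze = [  # raw, untyped landing data
    {"user": "a1", "amount": "10.5", "ts": "2024-01-01"},
    {"user": "a1", "amount": "bad", "ts": "2024-01-02"},   # malformed row
    {"user": "b2", "amount": "7.0", "ts": "2024-01-01"},
]

def to_silver(rows):
    """Silver layer: enforce types, drop rows that fail validation."""
    out = []
    for r in rows:
        try:
            out.append({"user": r["user"], "amount": float(r["amount"]), "ts": r["ts"]})
        except (KeyError, ValueError):
            continue  # in practice, route bad rows to a quarantine table
    return out

def to_gold(rows):
    """Gold layer: business-level aggregate (total spend per user)."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["user"]] += r["amount"]
    return dict(totals)

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'a1': 10.5, 'b2': 7.0}
```

Decoupling these layers is what lets storage and compute scale independently: each layer is a materialized table, not a step inside one job.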

2. Engineering, Performance Tuning & FinOps

  • Data Engineering: Implement ETL/ELT pipelines aligned with defined architecture.
  • Build scalable Silver aggregations and Gold metrics layers.
  • Enforce security (RBAC/ABAC), row/column-level controls, and PII handling.
  • Maintain data dictionaries, metadata, and lineage as part of delivery standards.
  • Implement proactive data quality checks.
  • Compute Optimization & Scalability: Optimize compute resources (memory, cores, partitions, executors) based on:
      • Data volume (GB to TB scale)
      • Transformation complexity
      • Data movement and network I/O
      • SLA requirements (batch vs. real-time)
  • Optimize read volumes and cost efficiency.
  • Design scalable architectures with minimal manual intervention.
  • BAU Management: Handle enhancements, bug fixes, and pipeline optimizations.
  • Port pipelines and data when technology stacks evolve.
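As a rough sketch of the partition and executor sizing this section alludes to (the 128 MB partition target, 4 cores per executor, and 2-wave scheduling below are common rules of thumb, not figures from this posting):

```python
import math

def plan_partitions(data_gb: float, target_partition_mb: int = 128) -> int:
    """Rule of thumb: size partitions near the object-store block size so
    tasks are neither tiny (scheduler overhead) nor huge (spill, stragglers)."""
    return max(1, math.ceil(data_gb * 1024 / target_partition_mb))

def plan_executors(partitions: int, cores_per_executor: int = 4,
                   waves: int = 2) -> int:
    """Aim for tasks to fill the cluster in a small number of 'waves'
    rather than provisioning one core per task."""
    return max(1, math.ceil(partitions / (cores_per_executor * waves)))

parts = plan_partitions(100)         # 100 GB input -> 800 partitions
print(parts, plan_executors(parts))  # 800 100
```

In Spark these heuristics map onto settings such as `spark.sql.shuffle.partitions` and executor core/memory configuration; actual values depend on transformation complexity, shuffle volume, and SLA.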

3. Operational Excellence

  • Data Quality: Implement automated frameworks (e.g., Great Expectations, dbt tests) to ensure data integrity.
  • Orchestration: Manage workflows and dependencies using tools like Airflow, Dagster, or ADF, including SLAs, retries, and alerting.
  • DevOps & CI/CD: Apply best practices including version control (Git), automated testing, and deployment pipelines.
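A minimal, framework-free sketch of the automated quality checks mentioned above; Great Expectations and dbt tests provide richer, declarative versions of the same idea, and the column names here are hypothetical:

```python
def check_not_null(rows, column):
    """Return indices of rows where the column is missing or None."""
    return [i for i, r in enumerate(rows) if r.get(column) is None]

def check_in_range(rows, column, lo, hi):
    """Return indices of rows whose value falls outside [lo, hi].
    Null values are left to the not-null check."""
    return [i for i, r in enumerate(rows)
            if r.get(column) is not None and not (lo <= r[column] <= hi)]

orders = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": -3.0},   # fails the range check
    {"order_id": 3, "amount": None},   # fails the null check
]

null_failures = check_not_null(orders, "amount")
range_failures = check_in_range(orders, "amount", 0, 10_000)
print(null_failures, range_failures)  # [2] [1]
```

Wiring such checks into the orchestrator (Airflow/Dagster/ADF task that fails or alerts on violations) is what makes them "proactive" rather than after-the-fact audits.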

Skillset & Requirements

  • 5–8 years of experience in Data Engineering / Analytics Engineering, with at least 2 years in architecture and solution design.
  • Strong problem-solving ability with a practical, execution-focused mindset.
  • Experience preparing data for AI/LLM use cases (Vector DBs, Knowledge Graphs, Semantic Layers).
  • Expertise in data modelling (star and snowflake schemas) and modern open table formats for data lakes (Delta Lake, Iceberg, Hudi).
  • Strong understanding of distributed computing (Spark, Hive, BigQuery), including DAGs, partitioning, and shuffling.
  • Proven experience in performance tuning and troubleshooting large-scale systems.
  • Programming proficiency in SQL, Python, Spark (Scala is a plus).

Preferred / Good to Have

  • Experience with Generative AI architectures (RAG, Vector Databases).
  • Exposure to semantic/metric layer tools (LookML, Transform, MetricFlow).
  • Ability to prototype dashboards or analytics UI using modern AI tools.

Behavioral Attributes

  • High ethical standards
  • Strong ownership and accountability
  • Problem-solving mindset
  • First-principles thinking approach

Skills

Python
Airflow
AWS
automated testing
Azure
BigQuery
data analytics
data engineer
data lake
data models
data solutions
data structures
DevOps
ETL
Git
Google Cloud
Hive
Snowflake
Spark
SQL
version control