Data Engineering Lead

VanEck

Experience: 5+ yrs
Location: Mumbai, New York City, or Frankfurt
Job type: Full-time

Required skills

Azure Data Factory
Azure Synapse Pipelines
Azure Synapse Analytics
ADLS Gen2
Azure Function Apps
Azure Key Vault
Python
SQL
CI/CD
DevOps

About the role

Summary: MarketVector Indexes GmbH, the international index company of VanEck and a leading provider of equity, thematic, digital asset, and multi-asset benchmarks, is seeking a Data Engineering Lead for our Technology department. In this role, you will lead and mentor a hands-on data engineering team that designs, builds, and operates the data platforms powering our global equity, bond, digital asset, and custom indexes, including real-time/streaming and batch pipelines, curated data models for index calculation, and reliable data services/APIs. You will be accountable for pipeline reliability and performance, data quality and lineage, security and compliance, scalability, and code quality across the data/software development lifecycle, delivering auditable, production-grade datasets and services that meet stringent index-governance standards. Working closely with IT, Index Research, Operations, Product, Marketing, and Legal, you will coordinate across Frankfurt and New York, as well as with external consultants, to ingest and normalize complex market and reference data (e.g., exchange pricing, corporate actions, constituents), enforce data contracts and SLAs/SLOs for timeliness and accuracy, and ship high-quality, resilient data foundations that ensure the accuracy and performance of our benchmarks.

Essential Duties and Responsibilities:

Leadership & Team Management

Lead, mentor, and grow a team of data engineers.
Set coding standards, review code, and ensure best engineering practices.
Support hiring, onboarding, performance reviews, and career development.
Plan and prioritize work, ensuring timely and high-quality delivery.

Data Pipeline Development

Design, build, and maintain ETL/ELT pipelines for batch and real-time data using Azure Data Factory, Azure Synapse Pipelines, and Azure Synapse Notebooks for Spark-based transformations.
Integrate data from various sources into our Azure Synapse Analytics data warehouse and ADLS Gen2 data lake.
Implement scalable, efficient data ingestion and transformation workflows.
Ensure pipelines are reliable, maintainable, and monitored.

Data Platform Engineering

Work with other teams to implement and optimize data architecture.
Manage Azure cloud-based data platforms, including Azure Synapse Analytics, Azure Data Factory, Azure Function Apps, Azure Key Vault, and ADLS Gen2.
Design and maintain data models following Medallion architecture (Bronze/Silver/Gold) on ADLS Gen2; participate in schema design, modelling, and metadata management.
Optimize storage, compute, and overall data system performance.

Quality, Governance & Security

Develop automated data validation, quality checks, and anomaly detection.
Ensure data consistency, integrity, and documentation across systems.
Implement secure data access frameworks using Azure Key Vault for secrets management, in collaboration with Security teams.
Support compliance with relevant data regulations and internal policies.

Cross-Functional Collaboration

Work closely with other teams to align implementations with the data strategy.
Partner with software engineering teams for data integrations and API consumption.
Support analytics, BI, and machine learning teams by delivering high-quality datasets.
Communicate technical decisions and project status to leadership and stakeholders.

Technology Strategy & Innovation

Evaluate and adopt modern tools, technologies, and platforms.
Identify opportunities for automation, optimization, and architectural improvements.
Lead initiatives around data observability, lineage, and advanced monitoring.
Drive continuous improvement in performance, cost-efficiency, and scalability.
Support the adoption of AI-assisted development tools (e.g., Claude Code, OpenAI Codex) to accelerate engineering productivity across the team.

Pursue continuous learning through courses, workshops, or technical training in modern software engineering practices, cloud technologies, and architecture.

Qualifications

Required

5+ years of experience in data engineering.
Strong experience designing and building data pipelines (batch + streaming).
Proficiency with ETL/ELT frameworks and orchestration tools (Azure Data Factory, Azure Synapse Pipelines).
Strong SQL skills and experience working with cloud data warehouses (Azure Synapse Analytics, Serverless SQL Pool, dedicated SQL pools).
Experience with cloud environments (Microsoft Azure: Synapse Analytics, Azure Data Factory, Azure Function Apps, Azure Key Vault, ADLS Gen2, Azure Blob Storage).
Solid programming background (Python preferred).
Experience with data modeling (dimensional, relational, and/or Medallion/Lakehouse architecture patterns).
Familiarity with CI/CD, DevOps practices, and infrastructure-as-code concepts.
Excellent verbal and written communication, leadership, and problem-solving skills.
Ability to interact effectively with all levels of staff and clients.
Dedicated team player.
Detail-oriented, well organized, striving for excellence, and proactive in seeking areas to improve.
Passionate about technology and how it evolves.

Preferred

Experience leading or mentoring engineering teams.
Hands-on experience with distributed data systems (Azure Synapse Spark pools, Apache Spark).
Experience with Azure Synapse Notebooks and PySpark for data transformation and exploration.
Background in data governance, cataloging, and lineage tools.
Exposure to ML/AI pipeline integration.
Familiarity with containerization and orchestration (Docker, Kubernetes).
Familiarity with AI coding assistants and LLM-based tools (e.g., Claude Code, OpenAI Codex) as part of a day-to-day engineering workflow.

Education

Required:

Bachelor’s degree in Computer Science, Data Engineering, Information Systems, Software Engineering, Mathematics, Statistics, or a related technical field
OR equivalent professional experience in data engineering or large-scale data systems.

Preferred:

Master’s degree in a relevant field (Computer Science, Data Engineering, Information Systems, Data Science, or similar).
Professional certifications such as:
Azure Data Engineer Associate
Additional coursework or training in:
Data modeling and database design
Distributed systems and big data frameworks
ETL/ELT, data pipelines, and workflow orchestration
Data governance, data quality, and cloud data architectures

MarketVector Indexes GmbH - A VanEck Company

MarketVector Indexes develops, monitors and markets the MarketVector Indexes, a focused selection of pure-play and investable indexes designed to underlie financial products. They cover several asset classes including hard assets and international equity markets as well as fixed income markets.

MarketVector Indexes is the index business of VanEck, a US-based investment management firm and provider of the VanEck Vectors ETFs. Approximately USD 110 billion in assets under management are currently invested in financial products based on MarketVector Indexes.

Many of those products are the largest in their investment category. MarketVector Indexes also develops and maintains customized indexes for third parties that aim to track specific investment themes.

#MarketVector

About VanEck

Global investment management specializing in ETFs and mutual funds
