Data Engineer (GCP & Azure) - WFH/Remote

AgileEngine

Location: Indore, Madhya Pradesh, India
Job type: Full-time

Required skills

Python
Airflow
Apache
Apache Airflow
Apache Spark
API
Azure
BigQuery
clustering
communication skills
compliance
cross functional
data engineer
data models
Dataproc
GCP
Git
Google Cloud
Pandas
platform services
Spark
SQL
Terraform
Vault
VPC
Vertex

About the role

AgileEngine

Website: agileengine.com
Job details:
I have 2 exciting Senior Data Engineer openings with a globally strategic data modernisation programme at one of the world's leading investment intelligence firms.

Both are WFH roles, but candidates based in Mumbai / Pune / Bangalore will be preferred.

Experience: 6-8 years

Please read both JDs carefully and let me know which one aligns with your experience. Do not proceed if you don't have the relevant hands-on skills — these are highly specific roles.

---

🔷 Position 1 — Microsoft / Azure / Fabric Stack
For engineers with hands-on experience in Microsoft Fabric — OneLake, Fabric Data Factory, Delta Lake — and Azure cloud technologies. Strong Python and SQL required. Financial data experience is a strong plus.

🔷 Position 2 — Google Cloud Platform Stack
For engineers with hands-on BigQuery, Cloud Composer (Airflow), and Dataproc (Spark) experience. Strong Python and SQL required. Financial data experience is a strong plus.

---

Both roles offer high ownership, global exposure and the opportunity to work on cutting-edge data platform infrastructure.

If your experience aligns, please share:
1. Which position suits you and why
2. Email ID
3. Relevant Experience
4. CCTC / ECTC
5. Notice Period

⚠️ Please apply only if your hands-on experience directly matches the stack mentioned. Generic data engineering profiles without the specific cloud platform experience will not be considered.

---

# DETAILED JOB DESCRIPTIONS

---

# 🔥 Position 1 — Data Engineer (Senior)
## Microsoft / Azure / Fabric Stack
### Mumbai / Pune / Bangalore | Hybrid | 6-8 Years

🚀 Hybrid Opportunity | 6-8 Years Experience | Financial Data & Microsoft Fabric

We're looking for a strong Data Engineer to join a globally strategic data modernisation programme at one of the world's leading investment intelligence firms. You'll design, build and maintain state-of-the-art data pipelines on Microsoft Fabric as part of a platform that powers investment decision tools used across the globe.

This is a high ownership, high impact role — not just another pipeline job.

---

✅ Must-Have Skills:

• 6-8 years of hands-on data engineering experience
• Strong Python programming — pipelines, transformation logic and automation
• Proficient in SQL — window functions, partitioning and time-series query patterns
• Hands-on experience with Microsoft Fabric — OneLake, Fabric Data Factory, Lakehouse and Warehouse
• Working knowledge of Delta Lake — incremental merges, Z-ordering and Change Data Feed
• Familiarity with Azure cloud technologies — ADF, Azure SQL, Key Vault and RBAC
• REST API experience — consuming external vendor APIs and building service integrations
• Git based collaboration — branching strategies, PR workflows and pipeline-as-code
• AI assisted development tools — GitHub Copilot, Cursor or equivalent
• Strong sense of ownership across ingestion, QA, correction management and audit trails
• Excellent communication skills — you'll work with global cross functional teams across engineering, compliance and business

💼 Key Responsibilities:

• Build and maintain scalable distributed data pipelines on Microsoft Fabric including OneLake lakehouse layers and Delta Lake merge workflows
• Design and implement bitemporal data models to support certified, regulatory-grade time-series datasets
• Build and maintain software testing frameworks — unit, non-regression and user acceptance — for pipelines and transformation logic
• Acquire, normalise, transform and release large volumes of financial market data
• Support AI solution integration including AI assisted ingestion, anomaly detection and semantic search over the lakehouse
• Collaborate actively with stakeholders across data engineering, compliance and business teams globally
• Contribute to shared platform services — this is a platform role, not a vertical specific one

➕ Good to Have:

• Experience with pandas, PySpark or equivalent data manipulation libraries
• Familiarity with Microsoft Purview for data lineage, cataloguing and sensitivity classification
• Understanding of bitemporal data modelling for financial and regulatory datasets
• Knowledge of financial reference data — equities, fixed income, corporate actions or index composition
• Exposure to CI/CD pipelines and automated environment provisioning
• Experience with LLMs and Agentic AI — anomaly detection, semantic search or natural language querying over structured data is a strong plus!

---

📋 Quick Check Before You Apply:

6-8 years in data engineering with strong Python, SQL, and hands-on Microsoft Fabric exposure — specifically OneLake, Fabric Data Factory, and Delta Lake? Comfortable with Azure and financial data at scale? Yes to all — apply. No Fabric experience? This one's not for you.

---

⚠️ IMPORTANT — Please ensure ALL of the following are explicitly mentioned in your resume before applying:
• Microsoft Fabric — OneLake, Fabric Data Factory, Lakehouse, Warehouse
• Delta Lake — incremental merges, Z-ordering, Change Data Feed
• Python — data pipeline development and transformation logic
• SQL — window functions, partitioning, time-series patterns
• Azure technologies — ADF, Azure SQL, Key Vault, RBAC
• Git-based workflows
• AI-assisted development tools

Resumes that do not clearly reflect these skills will not be shortlisted.

---

📩 Interested candidates, please share:
1. Email ID
2. Relevant Experience
3. CCTC / ECTC
4. Notice Period

---

# 🔥 Position 2 — Data Engineer (Senior)
## Google Cloud Platform Stack
### Mumbai / Pune / Bangalore | Hybrid | 6-8 Years

🚀 Hybrid Opportunity | 6-8 Years Experience | Financial Data & Google Cloud Platform

We're looking for a strong Data Engineer to join a globally strategic data modernisation programme at one of the world's leading investment intelligence firms. You'll design, build and maintain state-of-the-art data pipelines on GCP as part of a platform that powers investment decision tools used across the globe.

This is a high ownership, high impact role — not just another pipeline job.

---

✅ Must-Have Skills:

• 6-8 years of hands-on data engineering experience
• Strong Python programming — pipelines, transformation logic and automation
• Proficient in SQL with strong hands-on BigQuery experience — partitioning, clustering, materialised views and time-series query patterns at scale
• Hands-on experience with Cloud Composer (Apache Airflow) — DAG authoring, SLA alerting, retry logic and dependency management
• Working knowledge of Dataproc (Apache Spark) — batch ingestion, Delta Lake merge operations and incremental data processing
• Familiarity with GCP technologies — Cloud Storage, Pub/Sub, Datastream, Cloud Monitoring, IAM and VPC Service Controls
• REST API experience — consuming external vendor APIs and building service integrations
• Git based collaboration — branching strategies, PR workflows and pipeline-as-code
• AI assisted development tools — GitHub Copilot, Cursor or equivalent
• Strong sense of ownership across ingestion, QA, correction management and audit trails
• Excellent communication skills — you'll work with global cross functional teams across engineering, compliance and business

💼 Key Responsibilities:

• Build and maintain scalable distributed data pipelines on GCP, including BigQuery-based lakehouse layers and Dataproc-driven Delta Lake workflows
• Design and implement bitemporal data models on BigQuery to support certified, regulatory-grade time-series datasets
• Build and maintain software testing frameworks — unit, non-regression and user acceptance — for pipelines and transformation logic
• Acquire, normalise, transform and release large volumes of financial market data through the OMDP data factory
• Support AI solution integration using Vertex AI — including AI assisted ingestion, anomaly detection and semantic search over the lakehouse
• Collaborate actively with stakeholders across data engineering, compliance and business teams globally
• Contribute to shared platform services — this is a platform role, not a vertical specific one

➕ Good to Have:

• Experience with pandas, PySpark or equivalent data manipulation libraries
• Familiarity with Dataplex for data discovery, lineage, policy tagging and data quality rule management
• Understanding of Change Data Capture patterns using Datastream for replicating transactional data into BigQuery
• Understanding of bitemporal data modelling concepts within BigQuery's append-optimised design
• Knowledge of financial reference data — equities, fixed income, corporate actions or index composition
• BigQuery cost management — slot reservations, query cost controls and workload isolation
• Exposure to CI/CD pipelines and infrastructure as code using Terraform for GCP deployments
• Prior experience with LLMs and Agentic AI using Vertex AI — anomaly detection, semantic search or natural language querying over structured data is a strong plus!

---

📋 Quick Check Before You Apply:

6-8 years in data engineering with strong Python, SQL, and hands-on GCP experience — specifically BigQuery, Cloud Composer, and Dataproc? Comfortable working with large volumes of financial data in a global cross-functional environment? Yes to all — apply. No GCP or BigQuery hands-on experience? This one's not for you.

---

⚠️ IMPORTANT — Please ensure ALL of the following are explicitly mentioned in your resume before applying:
• GCP — Cloud Storage, Pub/Sub, Datastream, Cloud Monitoring, IAM, VPC Service Controls
• BigQuery — partitioning, clustering, materialised views, time-series query patterns
• Cloud Composer (Apache Airflow) — DAG authoring, SLA alerting, retry logic
• Dataproc (Apache Spark) — batch ingestion, Delta Lake merge operations
• Python — data pipeline development and transformation logic
• SQL — advanced query patterns at scale
• Git-based workflows
• AI-assisted development tools

Resumes that do not clearly reflect these skills will not be shortlisted.

---

📩 Interested candidates, please share:
1. Email ID
2. Relevant Experience
3. CCTC / ECTC
4. Notice Period

⚠️ Please apply only if your experience aligns with the requirements. Candidates with GCP and financial data experience will be prioritised.
