AgileEngine
Website: agileengine.com
Job details:
I have 2 exciting Senior Data Engineer openings with a globally strategic data modernisation programme at one of the world's leading investment intelligence firms.
Both are WFH roles, but candidates based in Mumbai / Pune / Bangalore will be preferred.
Experience: 6-8 years
Please read both JDs carefully and let me know which one aligns with your experience. Do not proceed if you don't have the relevant hands-on skills — these are highly specific roles.
---
🔷 Position 1: Microsoft / Azure / Fabric Stack
For engineers with hands-on experience in Microsoft Fabric — OneLake, Fabric Data Factory, Delta Lake — and Azure cloud technologies. Strong Python and SQL required. Financial data experience is a strong plus.
🔷 Position 2: Google Cloud Platform Stack
For engineers with hands-on BigQuery, Cloud Composer (Airflow), and Dataproc (Spark) experience. Strong Python and SQL required. Financial data experience is a strong plus.
---
Both roles offer high ownership, global exposure and the opportunity to work on cutting-edge data platform infrastructure.
If your experience aligns, please share:
1. Which position suits you and why
2. Email ID
3. Relevant Experience
4. CCTC / ECTC
5. Notice Period
⚠️ Please apply only if your hands-on experience directly matches the stack mentioned. Generic data engineering profiles without the specific cloud platform experience will not be considered.
---
# DETAILED JOB DESCRIPTIONS
---
# 🔥 Position 1: Data Engineer (Senior)
## Microsoft / Azure / Fabric Stack
### Mumbai / Pune / Bangalore | Hybrid | 6-8 Years
🚀 Hybrid Opportunity | 6-8 Years Experience | Financial Data & Microsoft Fabric
We're looking for a strong Data Engineer to join a globally strategic data modernisation programme at one of the world's leading investment intelligence firms. You'll design, build and maintain state-of-the-art data pipelines on Microsoft Fabric as part of a platform that powers investment decision tools used across the globe.
This is a high ownership, high impact role — not just another pipeline job.
---
✅ Must-Have Skills:
• 6-8 years of hands-on data engineering experience
• Strong Python programming — pipelines, transformation logic and automation
• Proficient in SQL — window functions, partitioning and time-series query patterns
• Hands-on experience with Microsoft Fabric — OneLake, Fabric Data Factory, Lakehouse and Warehouse
• Working knowledge of Delta Lake — incremental merges, Z-ordering and Change Data Feed
• Familiarity with Azure cloud technologies — ADF, Azure SQL, Key Vault and RBAC
• REST API experience — consuming external vendor APIs and building service integrations
• Git based collaboration — branching strategies, PR workflows and pipeline-as-code
• AI assisted development tools — GitHub Copilot, Cursor or equivalent
• Strong sense of ownership across ingestion, QA, correction management and audit trails
• Excellent communication skills — you'll work with global cross functional teams across engineering, compliance and business
💼 Key Responsibilities:
• Build and maintain scalable distributed data pipelines on Microsoft Fabric including OneLake lakehouse layers and Delta Lake merge workflows
• Design and implement bitemporal data models to support certified regulatory grade time-series datasets
• Build and maintain software testing frameworks — unit, non-regression and user acceptance — for pipelines and transformation logic
• Acquire, normalise, transform and release large volumes of financial market data
• Support AI solution integration including AI assisted ingestion, anomaly detection and semantic search over the lakehouse
• Collaborate actively with stakeholders across data engineering, compliance and business teams globally
• Contribute to shared platform services — this is a platform role, not a vertical specific one
➕ Good to Have:
• Experience with pandas, PySpark or equivalent data manipulation libraries
• Familiarity with Microsoft Purview for data lineage, cataloguing and sensitivity classification
• Understanding of bitemporal data modelling for financial and regulatory datasets
• Knowledge of financial reference data — equities, fixed income, corporate actions or index composition
• Exposure to CI/CD pipelines and automated environment provisioning
• Experience with LLMs and Agentic AI — anomaly detection, semantic search or natural language querying over structured data is a strong plus!
---
📋 Quick Check Before You Apply:
6-8 years in data engineering with strong Python, SQL, and hands-on Microsoft Fabric exposure — specifically OneLake, Fabric Data Factory, and Delta Lake? Comfortable with Azure and financial data at scale? Yes to all — apply. No Fabric experience? This one's not for you.
---
⚠️ IMPORTANT — Please ensure ALL of the following are explicitly mentioned in your resume before applying:
• Microsoft Fabric — OneLake, Fabric Data Factory, Lakehouse, Warehouse
• Delta Lake — incremental merges, Z-ordering, Change Data Feed
• Python — data pipeline development and transformation logic
• SQL — window functions, partitioning, time-series patterns
• Azure technologies — ADF, Azure SQL, Key Vault, RBAC
• Git based workflows
• AI assisted development tools
Resumes that do not clearly reflect these skills will not be shortlisted.
---
📩 Interested candidates, please share:
1. Email ID
2. Relevant Experience
3. CCTC / ECTC
4. Notice Period
---
# 🔥 Position 2: Data Engineer (Senior)
## Google Cloud Platform Stack
### Mumbai / Pune / Bangalore | Hybrid | 6-8 Years
🚀 Hybrid Opportunity | 6-8 Years Experience | Financial Data & Google Cloud Platform
We're looking for a strong Data Engineer to join a globally strategic data modernisation programme at one of the world's leading investment intelligence firms. You'll design, build and maintain state-of-the-art data pipelines on GCP as part of a platform that powers investment decision tools used across the globe.
This is a high ownership, high impact role — not just another pipeline job.
---
✅ Must-Have Skills:
• 6-8 years of hands-on data engineering experience
• Strong Python programming — pipelines, transformation logic and automation
• Proficient in SQL with strong hands-on BigQuery experience — partitioning, clustering, materialised views and time-series query patterns at scale
• Hands-on experience with Cloud Composer (Apache Airflow) — DAG authoring, SLA alerting, retry logic and dependency management
• Working knowledge of Dataproc (Apache Spark) — batch ingestion, Delta Lake merge operations and incremental data processing
• Familiarity with GCP technologies — Cloud Storage, Pub/Sub, Datastream, Cloud Monitoring, IAM and VPC Service Controls
• REST API experience — consuming external vendor APIs and building service integrations
• Git based collaboration — branching strategies, PR workflows and pipeline-as-code
• AI assisted development tools — GitHub Copilot, Cursor or equivalent
• Strong sense of ownership across ingestion, QA, correction management and audit trails
• Excellent communication skills — you'll work with global cross functional teams across engineering, compliance and business
💼 Key Responsibilities:
• Build and maintain scalable distributed data pipelines on GCP including BigQuery based lakehouse layers and Dataproc driven Delta Lake workflows
• Design and implement bitemporal data models on BigQuery to support certified regulatory grade time-series datasets
• Build and maintain software testing frameworks — unit, non-regression and user acceptance — for pipelines and transformation logic
• Acquire, normalise, transform and release large volumes of financial market data through the OMDP data factory
• Support AI solution integration using Vertex AI — including AI assisted ingestion, anomaly detection and semantic search over the lakehouse
• Collaborate actively with stakeholders across data engineering, compliance and business teams globally
• Contribute to shared platform services — this is a platform role, not a vertical specific one
➕ Good to Have:
• Experience with pandas, PySpark or equivalent data manipulation libraries
• Familiarity with Dataplex for data discovery, lineage, policy tagging and data quality rule management
• Understanding of Change Data Capture patterns using Datastream for replicating transactional data into BigQuery
• Understanding of bitemporal data modelling concepts within BigQuery's append optimised design
• Knowledge of financial reference data — equities, fixed income, corporate actions or index composition
• BigQuery cost management — slot reservations, query cost controls and workload isolation
• Exposure to CI/CD pipelines and infrastructure as code using Terraform for GCP deployments
• Prior experience with LLMs and Agentic AI using Vertex AI — anomaly detection, semantic search or natural language querying over structured data is a strong plus!
---
📋 Quick Check Before You Apply:
6-8 years in data engineering with strong Python, SQL, and hands-on GCP experience — specifically BigQuery, Cloud Composer, and Dataproc? Comfortable working with large volumes of financial data in a global cross-functional environment? Yes to all — apply. No GCP or BigQuery hands-on experience? This one's not for you.
---
⚠️ IMPORTANT — Please ensure ALL of the following are explicitly mentioned in your resume before applying:
• GCP — Cloud Storage, Pub/Sub, Datastream, Cloud Monitoring, IAM, VPC Service Controls
• BigQuery — partitioning, clustering, materialised views, time-series query patterns
• Cloud Composer (Apache Airflow) — DAG authoring, SLA alerting, retry logic
• Dataproc (Apache Spark) — batch ingestion, Delta Lake merge operations
• Python — data pipeline development and transformation logic
• SQL — advanced query patterns at scale
• Git based workflows
• AI assisted development tools
Resumes that do not clearly reflect these skills will not be shortlisted.
---
📩 Interested candidates, please share:
1. Email ID
2. Relevant Experience
3. CCTC / ECTC
4. Notice Period
⚠️ Please apply only if your experience aligns with the requirements. Candidates with GCP and financial data experience will be prioritised.