Job details:
Data Observability Engineer
Our client is seeking a highly skilled, detail-oriented Data Observability Engineer to join its growing Data Platform team. This role is responsible for designing, building, and maintaining systems that ensure the reliability, quality, traceability, and visibility of data across the organization.
The ideal candidate will have strong experience in data logging, monitoring, metadata management, and Azure Data Lake Storage (ADLS). You will play a crucial role in developing frameworks that provide deep insight into the organization’s data pipelines and ensure that data meets high standards for availability, accuracy, and usability.
Key Responsibilities:
1. Data Observability & Monitoring
Design and implement a data observability framework to monitor data freshness, accuracy, schema changes, lineage, and volume anomalies.
Develop automated alerting and logging systems for data pipeline failures, data quality issues, and performance degradation.
Collaborate with data engineers to integrate observability tools (e.g., Monte Carlo, Databand, Great Expectations, OpenLineage, Airflow logging).
Monitor SLAs and data pipeline KPIs, ensuring timely issue detection and resolution.
2. Data Logging & Traceability
Develop and maintain scalable logging mechanisms for batch and streaming data pipelines.
Implement centralized log aggregation and monitoring using tools such as Azure Monitor, Log Analytics, or the ELK stack.
Build robust mechanisms for data lineage and auditing, ensuring full traceability of data flow.
3. Azure Data Lake Storage (ADLS)
Manage and optimize the organization’s ADLS Gen2 architecture and data partitioning strategies.
Implement access control, data lifecycle management, and versioning policies.
Automate data movement and transformation across ADLS, ensuring compliance with governance policies.
4. Metadata Management Framework
Design and implement a metadata framework to catalog datasets, schemas, lineage, and data owners.
Integrate metadata systems with tools like Azure Purview, Collibra, or Amundsen.
Enable data discoverability through semantic tagging and business glossary development.
Support Data Governance and Compliance teams with data classification and policy enforcement.
Required Qualifications:
5+ years of experience in Data Engineering, Data Platform, or similar roles.
Strong experience with Azure Data Lake Storage (ADLS) and the Azure data ecosystem.
Experience with data observability, quality, logging, or monitoring tools (e.g., Datadog, Great Expectations, Monte Carlo, OpenLineage).
Solid programming skills in Python and SQL, and familiarity with Airflow, Databricks, or Synapse.
Experience working with metadata tools and frameworks (Azure Purview, Amundsen, or similar).
Knowledge of data governance, data lineage, and compliance standards (e.g., GDPR, HIPAA).
Preferred Qualifications:
Experience building custom observability or metadata tools.
Familiarity with the modern data stack (e.g., dbt, Snowflake, Kafka, Delta Lake).
Understanding of distributed systems and scalable data architectures.
Exposure to CI/CD pipelines and Infrastructure as Code (e.g., Terraform, ARM templates).