The HEINEKEN Company
Website: theheinekencompany.com
Job details:
The Data Engineer is responsible for designing, building, and operating high-quality, scalable, and reusable data services that support analytics, AI, and GenAI use cases across business domains. In this role, you will design and work hands-on with data pipelines, data models, orchestration frameworks, storage layers, and observability tooling. You will collaborate closely with AI Engineers, Data Scientists, Product Owners, and Platform teams to deliver reliable, well-governed, and self-service data products.
Key Responsibilities
Data Platform & Services Engineering
• Build and maintain scalable data pipelines and ingestion frameworks for batch, streaming, and event-driven data.
• Develop and maintain modular data models and semantic layers optimized for analytics, BI self-service, and AI use cases.
• Implement and operate orchestration workflows (e.g., Databricks Workflows) and compute workloads (Spark, SQL, Python).
• Work with storage technologies such as Delta Lake and ADLS, as well as feature stores and vector stores.
Data Quality, Governance & Observability
• Implement data quality checks, validations, and monitoring to ensure reliability and trust in data products.
• Contribute to data lineage, metadata management, and documentation.
• Apply observability practices using tools such as Great Expectations or Monte Carlo.
• Ensure compliance with data governance standards and regulations (e.g., GDPR) in collaboration with data governance teams.
Enablement for AI & Analytics Use Cases
• Deliver curated datasets and reusable data assets for analytics, machine learning, and GenAI applications.
• Build pipelines that process structured, graph, and unstructured data (e.g., text, documents, images).
• Support AI Engineering teams with data preparation for embeddings, vector stores, and retrieval-augmented generation (RAG) pipelines.
Tooling & Self-Service
• Contribute to data engineering tooling and frameworks that enable efficient development and deployment of pipelines.
• Develop data pipelines using tools such as dbt and Databricks Lakeflow.
• Support reuse of data services through clear documentation, data contracts, templates, and examples.
Collaboration & Ways of Working
• Collaborate with Data Scientists, AI Engineers, Product Owners, Business SMEs, and Platform teams.
• Participate in technical design discussions, code reviews, and architecture forums.
• Follow engineering best practices for version control, testing, CI/CD, and operational excellence.
Preferred Qualifications
• 5+ years of experience in data engineering and building production-grade data pipelines.
• Strong hands-on experience with data platforms such as Databricks.
• Solid knowledge of data modeling, SQL, Spark, and Python.
• Experience with orchestration frameworks, data quality tooling, and observability practices.
• Exposure to unstructured data processing and AI/GenAI data pipelines is a strong plus.
• Experience working in a global, multi-team environment is beneficial.
Success in This Role Means
• Reliable, well-documented data products are available for analytics and AI use cases.
• Data pipelines are scalable, cost-efficient, observable, and easy to operate.
• Data engineers and AI teams can move faster using reusable patterns and self-service data services.
• Structured and unstructured data are effectively integrated to support advanced analytics and GenAI innovation.