Data Architect (Databricks, BigQuery)
AuxoAI
- Location: Gurugram, Haryana, India
- Job type: Full-time
Required skills
- Airflow
- AWS
- Apache
- Apache Airflow
- CI
- compliance
- data modeling
- data models
- data science
- data strategy
- data warehouse
- Databricks
- Kafka
- Lambda
- Snowflake
- Spark
- SQL
- Vault
- Unity
About the role
AuxoAI
Website: auxoai.com
Job details:
Role & Responsibilities
- Lead enterprise-scale implementations of data warehouses and data platforms on Databricks and Snowflake.
- Design and implement Medallion (Bronze/Silver/Gold) architecture and scalable enterprise data models.
- Establish data modeling standards (dimensional, data vault, lakehouse patterns) and ensure best practices across projects.
- Establish enterprise data governance frameworks including cataloging, lineage, stewardship, and compliance using Atlan.
- Define and implement CI/CD pipelines for infrastructure and data platform deployments.
- Design data architectures that support AI/ML and Generative AI workloads including vector storage, feature layers, and secure access patterns.
- Build scalable ingestion frameworks supporting batch, streaming, and CDC pipelines.
- Architect secure, high-performance data integration layers for analytics, BI, and AI consumption.
- Develop target-state architecture blueprints and enforce data standards, governance, and best practices across teams.
- Collaborate with engineering, analytics, and data science teams to ensure platform alignment and scalability.
- Engage with clients as a trusted advisor, driving data strategy, roadmap definition, and identifying opportunities for expansion.
Ideal Candidate
- Strong Databricks / AWS Data Architect profile
- Mandatory (Experience 1) – Must have 8+ years of experience in Data Architecture / Data Engineering, with exposure to enterprise-scale data platform modernization initiatives
- Mandatory (Experience 2) – Must have 3+ years of deep hands-on experience in Databricks-based lakehouse architecture on AWS, including large-scale data platform implementations
- Mandatory (Experience 3) – Strong expertise in the Databricks ecosystem, including Delta Lake, Databricks SQL, Unity Catalog, Delta Live Tables, and MLflow, with a focus on performance optimization and security
- Mandatory (Experience 4) – Strong experience with AWS data services including S3, Glue, EMR, Lambda, Redshift, Athena, Lake Formation, and DMS, with a strong understanding of cloud-native architecture patterns
- Mandatory (Experience 5) – Proven experience designing and implementing Medallion (Bronze/Silver/Gold) architecture, scalable data models (Dimensional/Data Vault), and enterprise lakehouse platforms supporting batch and real-time processing
- Mandatory (Experience 6) – Must have hands-on experience building scalable ingestion frameworks including batch, streaming, and CDC pipelines using tools like Kafka, Kinesis, Spark, or similar technologies
- Mandatory (Skill 1) – Proven experience implementing CI/CD pipelines for data platforms, including infrastructure as code, automated deployments, and environment management
- Mandatory (Skill 2) – Hands-on experience enabling data platforms for AI/ML and Generative AI use cases, including feature stores, vector storage, and secure data access patterns
- Mandatory (Skill 3) – Experience with orchestration tools such as Apache Airflow or MWAA and designing integration layers for analytics, BI, and AI consumption
- Preferred (Company) – Product companies
- Preferred (Certification) – AWS / Databricks / Snowflake certifications; experience with Snowflake alongside Databricks; exposure to MDM, data quality frameworks, and enterprise metadata tools