Lead Cloud & Big Data Architect
Impetus
- Location: Chennai, Tamil Nadu, India
- Job type: Full-time
Required skills
- Python
- Agile
- Airflow
- Apache
- Apache Airflow
- BigQuery
- Cassandra
- cross-functional
- data modeling
- Dataflow
- Dataproc
- DevOps
- GCP
- Git
- Google Cloud
- Java
- MySQL
- NoSQL
- Oracle
- Postgres
- Shell Scripting
- Spark
- SQL
About the role
Impetus
Website: impetus.com
Job details:
Primary Skills: GCP, Big Data
Secondary Skills: Python/Spark
Job Description
- Architect and implement enterprise-grade data migration solutions in Java and Python that move data from on-premises systems to GCP (Cloud Storage, BigQuery, Pub/Sub), orchestrated with Apache Airflow and Google Cloud Composer.
- Build secure, scalable, and optimized data architectures leveraging GCP services such as Cloud Storage, Pub/Sub, Dataproc, Dataflow, and BigQuery.
- Design and implement automated frameworks for data delivery, monitoring, and troubleshooting.
- Develop data observability frameworks to ensure quality, lineage, and reliability across pipelines.
- Proactively monitor system performance, identify bottlenecks, and optimize pipelines for efficiency, scalability, and cost.
- Troubleshoot and resolve complex technical issues in distributed systems and cloud environments.
- Drive best practices in documentation of tools, architecture, processes, and solutions.
- Mentor junior engineers, conduct design/code reviews, and influence engineering standards.
- Collaborate with cross-functional teams to enable AI/ML and GenAI-driven use cases on LUMI.
Minimum Qualifications
- 9+ years of experience in data engineering, software engineering, or platform development.
- Strong programming expertise in Java, Python, and Shell scripting.
- Advanced knowledge of SQL, data modeling, and performance optimization.
- Deep expertise in Google Cloud Platform services: Cloud Storage, BigQuery, Pub/Sub, Dataproc, Dataflow.
- Strong background in relational databases (Oracle, Postgres, MySQL) and exposure to NoSQL databases (Cassandra, MongoDB, or similar).
- Proven track record in CI/CD pipelines, Git workflows, and Agile development.
- Demonstrated experience in building and scaling production-grade data pipelines.
- Strong problem-solving and troubleshooting skills in distributed and cloud-native systems.
Preferred Qualifications
- Hands-on experience with DevOps best practices, automation, and infrastructure as code.
- Exposure to platform engineering (networking, security, IAM, firewalls).
- Experience designing and implementing data observability frameworks (monitoring, lineage, anomaly detection).
- Hands-on experience with, or exposure to, GenAI integrations (LLMs, RAG, AI-driven data engineering workflows).