About the role
Design, develop, and maintain scalable data pipelines on Google Cloud Platform (GCP).
Optimize and automate data workflows using Dataproc, BigQuery, Dataflow, Cloud Storage, and Pub/Sub.
Build and maintain ETL processes for data ingestion, transformation, and loading into data warehouses.
Ensure the reliability and performance of data pipelines by using Apache Airflow for orchestration.
Collaborate with stakeholders to gather requirements and translate them into technical solutions.
Work on multiple data warehousing projects, applying a thorough understanding of concepts such as dimensions, facts, and slowly changing dimensions (SCDs).
Develop scripts and applications in Python and PySpark to handle large-scale data processing tasks.
Write optimized SQL queries for data analysis and transformations.
Use GitHub, Jenkins, Terraform, and Ansible to deploy and manage code in production.
Troubleshoot and resolve issues related to data pipelines, ensuring high availability and scalability.
Explore opportunities to integrate and leverage
About the company
HSBC is one of the largest banking and financial services organisations in the world, with operations in 64 countries and territories. We aim to be where the growth is, enabling businesses to thrive and economies to prosper, and, ultimately, helping people to fulfil their hopes and realise their ambitions.