SRE - Data Engineering
Salary: ₹30 - 50 LPA
Min Experience: 4 years
Location: Bengaluru
Job Type: Full-time
Overview
About the role
Job Title: SRE 3 - Data Engineering
Location: Bengaluru, Karnataka
What you will do:
- Design, implement, and maintain scalable data pipelines and infrastructure using Databricks, Redshift, and AWS services.
- Set up and manage Big Data environments, ensuring high availability and reliability of data processing systems.
- Develop and optimize ETL processes to move data between systems such as S3, Redshift, and Databricks.
- Utilize AWS EMR for processing large datasets efficiently, leveraging Spark for distributed data processing.
- Implement monitoring solutions to track the performance and reliability of data pipelines and storage solutions.
- Use tools like Prometheus and Grafana to visualize metrics and identify bottlenecks in data workflows.
- Ensure data integrity and security across all platforms, implementing best practices for data access and management.
- Collaborate with data governance teams to establish policies for data quality and compliance.
- Work closely with software development teams to integrate data solutions into applications, ensuring minimal disruption and high performance.
- Provide insights on data architecture and best practices for leveraging data in applications.
- Respond to incidents related to data processing and storage, performing root cause analysis and implementing solutions to prevent recurrence.
- Facilitate blameless post-mortems to continuously improve processes and systems.
Who you are:
- Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
- 4-8 years of experience in Data Engineering, Site Reliability Engineering, or a related field, with a focus on data engineering on AWS.
- Proficiency in Databricks and Redshift, with experience in data warehousing and analytics.
- Strong knowledge of AWS services, particularly S3, Athena, and EMR, for data storage and processing.
- Experience with programming languages such as Python or Scala for data manipulation and automation.
- Familiarity with SQL for querying databases and performing data transformations.
- Experience with distributed computing frameworks, particularly Apache Spark, for processing large datasets.
- Knowledge of data lake and data warehouse architectures, including the use of Delta Lake for managing data in Databricks.
- Proficiency in using tools like Terraform or AWS CloudFormation for provisioning and managing infrastructure.
- Familiarity with monitoring tools and practices to ensure system reliability and performance, including the use of AWS CloudWatch.
Tools and Technologies
- Data Platforms: Databricks, Amazon Redshift, AWS EMR, AWS S3, AWS Athena
- Big Data Frameworks: Apache Spark, Delta Lake
- Monitoring Tools: Prometheus, Grafana, AWS CloudWatch
- Infrastructure Management: Terraform, AWS CloudFormation
- Programming Languages: Python, Scala, SQL
Skills
Data Engineering
DevOps