Purple Austyn Technologies
Location: Kochi, Kerala, India
About the job
Role: Big Data Engineer
Experience: 4-8 years
Mode of Work: On-site
Employment Type: Full-time
Important: Hands-on experience in designing, building, and tuning scalable big data processing pipelines, including data ingestion, transformation, and workflow orchestration. Strong expertise in big data architectures, cloud-based data solutions, and end-to-end pipeline design and documentation. Proficiency in Python, Java, and Scala, with extensive experience using big data frameworks such as Apache Hadoop, Spark, Kafka, and Flink for batch and real-time data processing.
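As a rough illustration of the batch side of this work (an editorial sketch, not part of the posting itself), the snippet below shows a minimal PySpark ingest-transform-load job. The bucket paths, schema, and column names are hypothetical assumptions.

```python
# Minimal, hypothetical PySpark batch job: ingest raw events,
# transform them, and write curated output. Paths and column
# names are illustrative only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-batch").getOrCreate()

# Ingest: read raw JSON events (placeholder path).
events = spark.read.json("s3://example-bucket/raw/events/")

# Transform: drop rows without a user id, then count events per user per day.
daily_counts = (
    events.filter(F.col("user_id").isNotNull())
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("user_id", "event_date")
    .agg(F.count("*").alias("event_count"))
)

# Load: write partitioned Parquet for downstream analytics.
daily_counts.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_event_counts/"
)
spark.stop()
```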
Job Description
Design, develop, and maintain scalable and efficient big data processing pipelines on distributed computing systems.
Collaborate with cross-functional teams to understand data requirements and design appropriate data solutions.
Implement data ingestion, processing, and transformation processes to support various analytical and machine learning use cases.
Optimize and tune data pipelines for performance, scalability, and reliability.
Monitor and troubleshoot pipeline performance issues, identifying and resolving bottlenecks.
Ensure data quality and integrity throughout the pipeline, implementing data validation and error-handling mechanisms (see the sketch after this list).
Stay updated on emerging technologies and best practices in big data processing and analytics, incorporating them into our data engineering practices.
Document design decisions, technical specifications, and data workflows.
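To make the data-quality responsibility above concrete, here is a minimal, hypothetical PySpark validation step; the paths and validity rules are illustrative assumptions, not requirements from the posting.

```python
# Hypothetical validation step: rows failing basic checks are routed
# to a quarantine location rather than silently dropped. Paths and
# rules are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-validate").getOrCreate()
orders = spark.read.parquet("s3://example-bucket/raw/orders/")  # placeholder

# Validity rule: an order must have an id and a non-negative amount.
is_valid = (
    F.col("order_id").isNotNull()
    & F.col("amount").isNotNull()
    & (F.col("amount") >= 0)
)

valid = orders.filter(is_valid)
invalid = orders.filter(~is_valid)  # isNotNull() guards keep the split total

valid.write.mode("append").parquet("s3://example-bucket/curated/orders/")
invalid.write.mode("append").parquet("s3://example-bucket/quarantine/orders/")
spark.stop()
```

Routing failed rows to a quarantine location, rather than dropping them, preserves the evidence needed to debug upstream data issues.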
Mandatory Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field. Master's degree preferred.
Proven experience (3+ years) as a Data Engineer or similar role, with a focus on big data processing.
Strong proficiency in programming languages such as Python, Java, and Scala.
Experience with big data processing frameworks and technologies such as Apache Hadoop, Spark, Kafka, and Flink.
Hands-on experience with distributed computing, parallel processing, and cloud-based data platforms (e.g., AWS, Azure, Google Cloud).
Proficiency in SQL and database technologies (e.g., SQL Server, PostgreSQL, MySQL).
Experience with data warehousing, ETL (Extract, Transform, Load) processes, and data modeling.
Familiarity with containerization technologies (e.g., Docker, Kubernetes) and DevOps practices.
Excellent problem-solving skills and attention to detail.
Strong communication and collaboration skills, with the ability to work effectively in a team environment.
Tools and Technologies:
IDEs: IntelliJ IDEA, Eclipse
Build Tools: Maven, Gradle
Testing Frameworks: JUnit, Mockito, TestNG
Containerization: Docker, Kubernetes
API Documentation: Swagger, OpenAPI
Monitoring and Logging: Prometheus, Grafana, ELK Stack
Databases: MySQL, PostgreSQL, MongoDB, Redis
ORM Frameworks: Hibernate, Spring Data
Message Brokers: Kafka