Purple Austyn Technologies
Location: Kochi, Kerala, India
About the job
Role: Big Data Engineer
Experience: 4-8 years
Mode of Work: On-site
Employment Type: Full-time
Important: Hands-on experience in designing, building, and tuning scalable big data processing pipelines, including data ingestion, transformation, and workflow orchestration. Strong expertise in big data architectures, cloud-based data solutions, and end-to-end pipeline design and documentation. Proficiency in Python, Java, and Scala, with extensive experience using big data frameworks such as Apache Hadoop, Spark, Kafka, and Flink for batch and real-time data processing.
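As a rough illustration of the batch side of this work (an editorial sketch, not part of the posting itself), the snippet below shows a minimal PySpark ingest-transform-load job. The bucket paths, schema, and column names are hypothetical assumptions.

```python
# Minimal, hypothetical PySpark batch job: ingest raw events,
# transform them, and write curated output. Paths and column
# names are illustrative only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-batch").getOrCreate()

# Ingest: read raw JSON events (placeholder path).
events = spark.read.json("s3://example-bucket/raw/events/")

# Transform: drop rows without a user id, then count events per user per day.
daily_counts = (
    events.filter(F.col("user_id").isNotNull())
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("user_id", "event_date")
    .agg(F.count("*").alias("event_count"))
)

# Load: write partitioned Parquet for downstream analytics.
daily_counts.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_event_counts/"
)
spark.stop()
```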
Job Description
Design, develop, and maintain scalable and efficient big data processing pipelines on distributed computing systems.
Collaborate with cross-functional teams to understand data requirements and design appropriate data solutions.
Implement data ingestion, processing, and transformation processes to support various analytical and machine learning use cases.
Optimize and tune data pipelines for performance, scalability, and reliability.
Monitor and troubleshoot pipeline performance issues, identifying and resolving bottlenecks.
Ensure data quality and integrity throughout the pipeline, implementing data validation and error-handling mechanisms (see the sketch after this list).
Stay updated on emerging technologies and best practices in big data processing and analytics, incorporating them into our data engineering practices.
Document design decisions, technical specifications, and data workflows.
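To make the data-quality responsibility above concrete, here is a minimal, hypothetical PySpark validation step; the paths and validity rules are illustrative assumptions, not requirements from the posting.

```python
# Hypothetical validation step: rows failing basic checks are routed
# to a quarantine location rather than silently dropped. Paths and
# rules are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-validate").getOrCreate()
orders = spark.read.parquet("s3://example-bucket/raw/orders/")  # placeholder

# Validity rule: an order must have an id and a non-negative amount.
is_valid = (
    F.col("order_id").isNotNull()
    & F.col("amount").isNotNull()
    & (F.col("amount") >= 0)
)

valid = orders.filter(is_valid)
invalid = orders.filter(~is_valid)  # isNotNull() guards keep the split total

valid.write.mode("append").parquet("s3://example-bucket/curated/orders/")
invalid.write.mode("append").parquet("s3://example-bucket/quarantine/orders/")
spark.stop()
```

Routing failed rows to a quarantine location, rather than dropping them, preserves the evidence needed to debug upstream data issues.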
Mandatory Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field. Master's degree preferred.
Proven experience (3+ years) as a Data Engineer or similar role, with a focus on big data processing.
Strong proficiency in programming languages such as Python, Java, and Scala.
Experience with big data processing frameworks and technologies such as Apache Hadoop, Spark, Kafka, and Flink.
Hands-on experience with distributed computing, parallel processing, and cloud-based data platforms (e.g., AWS, Azure, Google Cloud).
Proficiency in SQL and database technologies (e.g., SQL Server, PostgreSQL, MySQL).
Experience with data warehousing, ETL (Extract, Transform, Load) processes, and data modeling.
Familiarity with containerization technologies (e.g., Docker, Kubernetes) and DevOps practices.
Excellent problem-solving skills and attention to detail.
Strong communication and collaboration skills, with the ability to work effectively in a team environment.
Tools and Technologies:
IDEs: IntelliJ IDEA, Eclipse
Build Tools: Maven, Gradle
Testing Frameworks: JUnit, Mockito, TestNG
Containerization: Docker, Kubernetes
API Documentation: Swagger, OpenAPI
Monitoring and Logging: Prometheus, Grafana, ELK Stack
Databases: MySQL, PostgreSQL, MongoDB, Redis
ORM Frameworks: Hibernate, Spring Data
Message Brokers: Kafka