Intern - Big Data

Min Experience

0 years

Location

Pune, IN

Job Type

Internship

About the job

About the role

The Big Data and Analytics group is a proactive, highly solutions-oriented and collaborative team that works with business groups across the organization. Our purpose is to capture massive amounts of data and transform this vital information into concrete, valuable insights that allow Seagate to make better and more strategic business decisions. If you're a master of the "big picture", with demonstrated success in developing Big Data solutions and building massive Data Lakes, then we want to talk with you. Seize this opportunity for your next big career challenge and create business value as you showcase your strength in technical architecture, designing systems and leading projects.

About the role - you will:
• Be part of a team of 10-12 Platform Engineers who develop and maintain Big Data (Data Lake, Data Warehouse and Data Integration) and advanced analytics platforms at Seagate
• Apply hands-on subject matter expertise in the architecture and administration of Big Data platforms - Data Warehouse Appliances, Open Data Lakes (AWS EMR, Hortonworks), Data Lake technologies (AWS S3/Databricks/other)
• Develop and manage Spark ETL frameworks, orchestrate data pipelines with Airflow, and support building Presto/Trino queries for key stakeholders (see the sketch after this section)
• Design, scale and deploy Machine Learning pipelines
• Collaborate with Application Architects and Business SMEs to design and develop end-to-end data pipelines and supporting infrastructure
• Establish and maintain productive relationships with peer organizations, partners, and software vendors

About you:
• Excellent coding skills in any language, with a deep desire to learn new skills and technologies
• You're a passionate professional who is up to the challenge of blending the fast-changing technology landscape of Big Data analytics with the complex, high-impact space of HiTech and Manufacturing analytics
• As a motivated self-starter, you have experience working in a dynamic environment
• Exceptional data engineering skills in large, high-scale data platforms and applications using cloud and big data technologies such as the Hadoop ecosystem and Spark
• Strong appetite for constant learning, thinking outside the box, and questioning problems and solutions with the intent to understand and solve them better
• Excellent interpersonal skills to develop relationships with different teams and peers in the organization

Your experience includes:
• Knowledge of big data processing frameworks: Spark, Hadoop, Hive, Kafka, EMR
• Working on Big Data solutions in the cloud (AWS or other)
• Advanced, hands-on architecture and administration experience on big data platforms
• Experience with Data Warehouse Appliances, Hadoop (AWS EMR), Data Lake technologies (AWS S3/GCS/other) and ML and Data Science platforms (Spark ML, H2O, KNIME)
• Working in Python, Java, Scala
• DevOps, Continuous Delivery, and Agile development
• Creating a culture of technical excellence by leading code and design reviews, promoting mentorship, and identifying and promoting educational opportunities for engineers
• Understanding of microservices and container-based development using the Docker and Kubernetes ecosystem is a big plus
• Experience working in a Software Product Development environment is a big plus
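As a minimal, hypothetical sketch of the kind of Spark ETL orchestration with Airflow described above (the DAG name, script path, connection id and schedule are illustrative assumptions, not part of the posting):

```python
# Minimal sketch: an Airflow DAG that submits a PySpark ETL job once a day.
# The DAG id, script path, connection id and schedule are assumptions for illustration.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="daily_spark_etl",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Submit the PySpark job to the cluster configured in the
    # "spark_default" Airflow connection (e.g. an EMR or Databricks endpoint).
    run_etl = SparkSubmitOperator(
        task_id="run_daily_etl",
        application="jobs/etl_daily.py",          # hypothetical PySpark script
        conn_id="spark_default",
        application_args=["--date", "{{ ds }}"],  # pass the execution date to the job
    )
```

In a setup like this, the tables written by the Spark job would then be queried by stakeholders through Presto/Trino; the exact stack and layout would depend on the team's platform.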

About the company

With more than four decades of storage innovation, Seagate empowers humanity to thrive in the data age and helps people and businesses navigate the ever-expanding data landscape. We craft precision-engineered, cutting-edge solutions that help the world store and manage exponential data growth. Seagate is powered by our talented and passionate workforce of 29,000 employees across the globe who embody our core values: integrity, innovation, and inclusion. Striving towards excellence every single day, we show up with these values for our customers, business partners, shareholders, and communities alike.

Skills

spark
hadoop
hive
kafka
emr
aws
python
java
scala
devops
continuous-delivery
agile
docker
kubernetes