Sr Data Engineer - Spark, Hive, SQL, AWS

Min Experience

5 years

Location

Gurgaon, Haryana

Job Type

full-time

About the job

Overview

About the role

Gartner is looking for a Data Engineer to join the team that develops and supports Gartner's client-facing experience. The role covers all data-related operations within the team. It offers an opportunity to work with several different technologies focused on data modelling, data services and data science, including building and consuming web services, integrating search technologies, building personalization and recommendation engines, and integrating data engineering best practices.

What you will do

Participate in architecture design and implementation of high-performance, scalable, and optimized data solutions.

Prepare documentation and specifications.

Demonstrate a good understanding of SQL, with the ability to create a data model from scratch.

Help write and optimize in-application SQL statements.

Ensure performance, security, and availability of databases.

Handle common database procedures such as upgrade, backup, recovery, and migration.

Design, build and automate the deployment of data pipelines and applications to support data scientists and researchers with their reporting and data requirements.

Integrate data from a wide variety of sources, including on-premise databases and external data sources, using REST APIs and harvesting tools.

Collaborate with internal business units and data science teams on business requirements, data access, processing/transformation and reporting needs, and leverage existing and new tools to provide solutions.

Work with the team on managing AWS resources (EMR, ECS clusters, etc.) and continuously improve the deployment process of our applications.

Promote the integration of new cloud technologies and continuously evaluate new tools that will improve the organization's capabilities while lowering the total cost of operation.

Support automation efforts across the data analytics team using Infrastructure as Code (IaC) with Terraform, Configuration Management, and Continuous Integration (CI) / Continuous Delivery (CD) tools such as Jenkins.

Work with the team to implement data governance and access control, and to identify and reduce security risks.

What you will need

5-6 years of experience in solution design and development of cloud-based data models, ETL pipelines and infrastructure for reporting, analytics, and data science.

Must have

Experience working with Spark, Hive, HDFS, MR, Apache Kafka/AWS Kinesis.

Experience working with both structured and unstructured data.

Strong proficiency with SQL and its variations among popular databases.

Ability to troubleshoot common database issues.

Experience with version control tools (Git, Subversion).

Experience using automated build systems (CI/CD).

Experience with data structures and algorithms.

Knowledge of different database technologies (relational, NoSQL, graph, document, key-value, time series, etc.), including building and managing scalable data models.

Knowledge of cloud-based platforms (AWS) is a must.

About the company

At Gartner, Inc. (NYSE:IT), we guide the leaders who shape the world. Our mission relies on expert analysis and bold ideas to deliver actionable, objective insight, helping enterprise leaders and their teams succeed with their mission-critical priorities. Since our founding in 1979, we've grown to more than 20,000 associates globally who support ~15,000 client enterprises in ~90 countries and territories. We do important, interesting and substantive work that matters. That's why we hire associates with the intellectual curiosity, energy and drive to want to make a difference. The bar is unapologetically high. So is the impact you can have here.

Skills

spark

hive

sql

aws

kafka

hdfs

git

subversion

ci/cd

data structures

algorithms

nosql

graph

document

key-value

time series