People Prime Worldwide
Website: people-prime.com
Job details:
About Company:
Our client is a global technology consulting and digital solutions company that enables enterprises to reimagine business models and accelerate innovation through digital technologies. Powered by more than 84,000 entrepreneurial professionals across more than 30 countries, it caters to over 700 clients with its extensive domain and technology expertise to help drive superior competitive differentiation, customer experiences, and business outcomes.
Job Title: PySpark Admin
Location: Hyderabad (Rai Durg) / Pune (Shivajinagar)
Experience: 5 to 8 Years
Employment Type: Contract to Hire
Work Mode: Hybrid
Notice Period: Immediate Joiners Only
Job Description:
A PySpark Admin is typically a specialized Hadoop or Databricks Administrator responsible for managing, monitoring, and optimizing Apache Spark environments, with a focus on PySpark applications and related infrastructure. This role blends system administration skills with big data and programming knowledge.
Core Responsibilities
A PySpark Admin's responsibilities include both administrative and data-focused tasks:
Cluster Management: Installing, configuring, upgrading, deploying, and maintaining Apache Spark and Hadoop/Databricks clusters to ensure high availability and optimal performance.
Performance Tuning and Optimization: Monitoring, troubleshooting, and resolving performance bottlenecks in Spark jobs and data pipelines. This includes optimizing query performance and resource utilization.
Security and Governance: Implementing and enforcing best practices for data security, access control (e.g., IAM, RBAC), and data governance frameworks like Databricks Unity Catalog.
Monitoring and Support: Setting up monitoring and logging tools like Splunk, CloudWatch, or Datadog to ensure system health, and providing day-to-day production support.
Automation and Integration: Automating routine administrative tasks using scripting (Python, Bash, Ansible) and integrating Spark environments with other data platforms, cloud services (AWS, Azure, GCP), and orchestration tools like Apache Airflow.
Collaboration: Working closely with data engineers, data scientists, and analysts to support data processing workflows and translate business requirements into scalable solutions.
Capacity Planning: Performing capacity planning and estimating hardware/software requirements based on data storage and processing needs.
Key Technical Skills and Qualifications
Successful PySpark Admins possess a blend of strong technical skills:
Proficiency in PySpark/Python: Strong programming skills in Python and its Spark API are essential for automation, development support, and troubleshooting.
Big Data Technologies: Expertise in the Hadoop ecosystem (HDFS, YARN, Hive, Kafka) or the Databricks platform is required.
Cloud Platform Experience: Hands-on experience with cloud services (AWS, Azure, GCP) that host big data platforms (e.g., AWS Glue, EMR, S3).
System Administration: Strong Linux/Unix system administration skills and experience with containerization technologies like Docker and Kubernetes.
SQL Knowledge: Proficiency in SQL for data analysis, validation, and performance tuning is a must.
Problem-Solving: Excellent analytical and problem-solving skills for debugging complex issues in production environments.
Skills
Mandatory Skills: MySQL, Oracle Linux Administration, Python