HPC Systems Engineer

Min Experience

5 years

Location

Bangalore, KA, India

Job Type

Full-time

About the role

Join us at Ampere and work alongside a passionate and growing team. We'd love to have you apply!

Our world-class Engineering Compute team is seeking an HPC Systems Engineer responsible for scaling Ampere's high-performance compute needs to best serve innovative engineering projects within the company.

What you'll achieve

Explore emerging technologies and technical developments to address expanding compute, networking, and memory requirements.
Implement industry-standard orchestration tools like Terraform to automate everything.
Work with internal teams and SW/EDA vendors to implement best practices and identify inefficiencies and bottlenecks.
Drive capacity planning, negotiations, and purchase of engineering compute licensing, machines, storage, and network.
Design, build, and deploy highly scalable compute environments to allow for infinite growth.
Modify and implement job flows that balance resources across both cloud and on-prem.
Identify and implement monitoring solutions that effectively show utilization of compute resources and licenses, to best serve different engineering projects within the company.
Create data-driven forecasting to track development cycles and projects.

About you

BS degree in Computer Science, Computer Engineering, or a related technical field, with 5+ years of experience.
Expertise in Perl, Python, Bash, or another scripting language, applied to scripting and automating infrastructure needs in a large-scale data centre environment.
Expertise in configuration management, with working knowledge of a tool such as CFEngine, Chef, or Puppet, along with demonstrated work experience in Ansible.
Expertise in setting up Prometheus environments and developing monitoring solutions with Prometheus.
Linux systems administration experience, with proven experience debugging issues with RHEL, Load Sharing Facility (LSF), and storage.
Familiarity with Git tools such as GitHub and GitLab.
Familiarity with ASIC design flow, EDA, and design simulations that run large compute jobs across multiple machines and cores is a plus.
Experience planning and deploying additional compute and storage in cloud and on-prem data centres.
Ability to drive technical leadership and management of complex, large-scale HPC system projects.
Effective and skilled at communicating and collaborating with multiple internal groups and business units located around the world.
Strong verbal and written communication skills (English).
Ability to multitask effectively in a dynamic environment.
Ability to manage without authority.

About the company

Invent the future with us. Recognized on Fast Company's 2023 list of the 100 Best Workplaces for Innovators, Ampere is a semiconductor design company for a new era, leading the future of computing with an innovative approach to CPU design focused on high-performance, energy-efficient, sustainable cloud computing. By providing a new level of predictable performance, efficiency, and sustainability, Ampere is working with leading cloud suppliers and a growing partner ecosystem to deliver cloud instances, servers, and embedded/edge products that can handle the compute demands of today and tomorrow.

Skills

perl
python
bash
terraform
cfengine
chef
puppet
ansible
prometheus
rhel
lsf
git
linux