TEKWISSEN
Website: tekwissen.com
Job details:
Overview
TekWissen is a global workforce management provider operating in India and many other countries. The client below is a global company with shared ideals and a deep sense of family. From our earliest days as a pioneer of modern transportation, we have sought to make the world a better place, one that benefits lives, communities, and the planet.
Job Title: Platform Engineering Senior Engineer
Location: Chennai
Work Type: Hybrid
Position Description
- We are seeking a highly skilled and motivated HPC CAE Support Engineer to join our team.
- You will be responsible for integrating, profiling, supporting and maintaining HPC CAE applications and user-facing tooling, ensuring optimal performance and reliability for our critical Supercomputing HPC Platform offering.
- This role also focuses heavily on CAE application support, integration, and interaction with consumers of the platform.
- If you are interested in engaging with every part of the HPC stack, from the servers to the product development engineers who use them, this position could be a good fit for you.
Responsibilities
- Install, integrate, optimize, and support CAE applications and workloads across an HPC environment with advanced CPU, GPU, and interconnect technologies.
- Support the CLI tooling and APIs that customers consume to streamline access to HPC infrastructure.
- Troubleshoot and resolve complex technical issues related to Linux systems, networking, storage, and CAE HPC applications.
- Develop and maintain documentation for software and procedures.
- Collaborate with software engineers and researchers to ensure seamless integration of HPC resources and scaling of applications.
- Stay up-to-date on the latest advancements in HPC and AI/ML technologies and best practices.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 3-5+ years of experience in CAE, systems, or software engineering.
- Strong familiarity with CAE or scientific computing.
- Strong understanding of Linux operating systems, preferably in an HPC environment.
- Proficiency in programming in one or more languages, preferably Python, Go, or Bash scripting.
- Familiarity with scaling applications and with the metrics collection, analysis, and visualization tools used to identify bottlenecks, such as Prometheus and Grafana.
- Excellent problem-solving and troubleshooting skills.
- The ability to define what problems need to be solved.
- Strong communication and collaboration skills.
Even better, you may have...
- Experience with containerization technologies like Docker or Kubernetes.
- Experience with monitoring tools like Prometheus, Icinga, Zabbix, Nagios, or similar.
You may not check every box, or your experience may look a little different from what we've outlined, but if you think you can bring value to the client's company, we encourage you to apply!
Skills Required
- Linux, Linux clusters, technical troubleshooting, problem solving
Experience Required
- Experience with Linux and proficiency in Linux administration
- Proficiency in shell scripting (primary) and Python (desirable)
- Experience and familiarity with containerization technologies like Docker or Kubernetes
Experience Preferred
- Experience and familiarity with monitoring tools such as Prometheus, Zabbix, Nagios, or similar
- Familiarity with CAE, scientific, and computational applications
Education Required
TekWissen® Group is an equal opportunity employer supporting workforce diversity.
Click Apply to learn more.