UST
Website:
ust.com
Job details:
Role Description
Client Job Title: Infra L1 Support Engineer
UST Job Title: Lead I - Lab Support
Who We Are
At UST, we help the world's best organizations grow and succeed through transformation. Bringing together the right talent, tools, and ideas, we work with our clients to co-create lasting change. Together, with over 30,000 employees in over 25 countries, we build for boundless impact, touching billions of lives in the process. Visit us at UST.com.
The Opportunity
We're hiring an Infra Support Engineer to deliver L1/L2 technical support across GMI GPU clusters.
Role Overview
We are seeking a skilled Infra Support Engineer to join the GMI Global Infrastructure team. This role focuses on GPU system delivery, incident detection, triage, basic remediation, runbook execution, monitoring, and clear escalation to the SRE (Site Reliability Engineering) team, while helping improve operational runbooks and observability.
Responsibilities
- Provide first- and second-line technical support to customers for the AI Infrastructure (GPU/CPU nodes, networking, storage, orchestration, platform services) via ticketing systems, email, Slack, or other messaging systems.
- Support GPU cluster delivery, including system provisioning, image deployment, network validation, BIOS/firmware updates, and GPU driver/runtime installation.
- Monitor system health and service-level indicators (alerts, dashboards); respond to alerts 24x7 as scheduled.
- Triage incidents, gather context, verify scope and impact, and follow standard operating procedures and runbooks to perform immediate mitigations.
- Escalate to the global SRE engineers with clear, concise incident notes and relevant logs/traces.
- Maintain incident logs, update status pages, and communicate timely updates to stakeholders during incidents.
- Perform routine operational tasks: log checks, health checks, capacity checks, and simple automated fixes.
- Participate in postmortems and contribute actionable follow-ups to reduce recurrence.
- Help maintain and improve SOPs, run periodic runbook validation, and document new procedures.
- Work collaboratively with developers and SRE teams to improve reliability.
Qualifications
- Bachelor's degree in Computer Science or a related field.
- 2+ years of experience in IT operations, server administration, SRE, DevOps, or technical support.
- Hands-on Linux experience (shell, kernel, logs).
- Basic networking knowledge (TCP/IP, DNS, HTTP, VLANs).
- Familiarity with monitoring/alerting/logging tools (e.g. Prometheus, Grafana, Alertmanager).
- Experience with NVIDIA GPU infrastructure and Kubernetes.
- Comfortable collecting diagnostics, reading logs, and interpreting traces.
- Strong troubleshooting mindset and ability to follow runbooks under pressure.
- Excellent written and verbal communication for customer-facing incident handling.
- Willingness to work shifts and participate in on-call rotations.
- Bilingual ability in English and Chinese is highly preferred.
Meeting every qualification is not required. If you're excited about this role, we'd love to hear from you. We believe diverse perspectives and experiences strengthen our team.
What We Believe
We're proud to embrace the same values that have shaped UST since the beginning. Since day one, we've been building enduring relationships and a culture of integrity. And today, it's those same values that are inspiring us to encourage innovation from everyone, to champion diversity and inclusion, and to place people at the centre of everything we do.
Humility
We will listen, learn, be empathetic and help selflessly in our interactions with everyone.
Humanity
Through business, we will better the lives of those less fortunate than ourselves.
Integrity
We honour our commitments and act with responsibility in all our relationships.
Equal Employment Opportunity Statement
UST is an Equal Opportunity Employer. We believe that no one should be discriminated against because of their differences, such as age, disability, ethnicity, gender, gender identity and expression, religion, or sexual orientation.
All employment decisions shall be made without regard to age, race, creed, colour, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by federal, state, or local law.
UST reserves the right to periodically redefine your roles and responsibilities based on the requirements of the organization and/or your performance.
- Support and promote the values of UST.
- Comply with all Company policies and procedures.
Skills
Technical support, Linux, TCP/IP, DNS, DevOps, Grafana, server administration