HTC Global Services
Website:
htcinc.com
Job details:
Job Description
About the Role:
We are looking for a NOC Engineer with 5 to 8 years of relevant experience in a NOC role.
Requirements
- Basic knowledge of network fundamentals (e.g., TCP/IP, DNS, VPN).
- Experience with network monitoring tools (e.g., SolarWinds, Nagios, PRTG, or similar).
- Strong attention to detail and ability to identify potential issues.
- Excellent communication skills in English, both written and verbal.
- Familiarity with incident management and ticketing in ServiceNow.
- Willingness to work in a 24/7 shift environment.
Responsibilities
- Monitor network systems, servers, applications, and infrastructure using enterprise monitoring tools.
- Identify, acknowledge, and document alerts/events in the incident management system.
- Perform initial troubleshooting and resolution of network, system, and application alerts.
- Actively support Major Incident Management (MIM) by promptly identifying P1/P2 incidents and initiating the MIM process.
- Coordinate and join major incident bridge calls, ensuring all relevant technical teams are engaged for faster restoration.
- Provide timely incident communications and status updates to stakeholders, service owners, and business teams during outages.
- Track major incident progress, follow up with resolver groups, and ensure action items are executed within defined timelines.
- Maintain accurate incident timelines, impact details, and communication logs for audit and reporting purposes.
- Support preparation of Major Incident Reports (MIR) and contribute inputs for Root Cause Analysis (RCA).
- Ensure adherence to MIM governance, escalation matrix, and communication protocols.
- Identify recurring incidents and recommend preventive measures and monitoring improvements.
- Escalate unresolved or high-risk incidents to leadership and service management as required.
- Support the coordination of post-incident reviews and ensure closure activities are completed.
- Maintain accurate documentation, knowledge base updates, and detailed shift handovers.
- Ensure adherence to SLAs and response-time targets, and maintain operational stability across environments.
- Monitor network systems, servers, applications, and infrastructure for alerts and performance issues.
- Identify and acknowledge alerts, documenting events in the incident management system.
- Perform initial troubleshooting and incident resolution for basic network and system issues.
- Escalate complex issues to L2 support or relevant teams in a timely manner.
- Maintain accurate documentation and provide detailed shift handovers.
- Adhere to SLAs and response times to ensure network reliability.