Site Reliability Engineer - Networking Support - US, AZ, Remote
NVIDIA is looking for a Site Reliability Engineer (SRE) to join its Networking Support team. As an SRE at NVIDIA, you will ensure the reliability and uptime of our customers' production environments. We are seeking an SRE with the mindset and methodology to maintain, monitor, and troubleshoot data center (DC) networking equipment.
SRE's culture of diversity, intellectual curiosity, problem solving and openness is important to our success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to build an environment that provides the support and mentorship needed to learn and grow.
What you will be doing:
Supervise equipment, applications, and processes through various tools, applications, and consoles
Rapidly debug and triage incidents and user-reported issues
Work with Tier 2 and Tier 3 support as required
Make valuable contributions to the overall health, performance, and reliability of the networking equipment and Infrastructure Services
Develop documentation for Operations processes
Work rotating shifts, including weekends and holidays, and overtime as required
What we need to see:
BS or diploma in the Information Technology field, or equivalent experience
4+ years of site reliability engineering experience working on large-scale distributed microservices in a production environment, with a real passion for automation and tooling
Must be able to operate network devices and pull cables over the racks in a data center environment
Ability to perform the physical labor of racking and unracking network equipment in a data center
Expertise in incident management, organizational change, and problem management processes. Ability to detect all service-impacting issues, with accurate triage, partner communication, impact containment, service …
Job Profile
Skills: Automation, Communication, Incident Management, Linux, Linux Operating System, Networking, Site Reliability Engineering, TCP/IP, TCP/IP Networks
Experience: 4+ years
Timezones: America/Anchorage, America/Chicago, America/Denver, America/Los_Angeles, America/New_York, Pacific/Honolulu, UTC-10, UTC-5, UTC-6, UTC-7, UTC-8, UTC-9