Staff Site Reliability Engineer - Incident Response

Remote - Washington, USA

About Zscaler

Serving thousands of enterprise customers around the world including 40% of Fortune 500 companies, Zscaler (NASDAQ: ZS) was founded in 2007 with a mission to make the cloud a safe place to do business and a more enjoyable experience for enterprise users. As the operator of the world’s largest security cloud, Zscaler accelerates digital transformation so enterprises can be more agile, efficient, resilient, and secure. The pioneering, AI-powered Zscaler Zero Trust Exchange™ platform protects thousands of enterprise customers from cyberattacks and data loss by securely connecting users, devices, and applications in any location. 

Named a Best Workplace in Technology by Fortune and others, Zscaler fosters an inclusive and supportive culture that is home to some of the brightest minds in the industry. If you thrive in an environment that is fast-paced and collaborative, and you are passionate about building and innovating for the greater good, come make your next move with Zscaler. 

Our Engineering team built the world's largest cloud security platform from the ground up, and we keep building. With more than 100 patents and big plans for enhancing services and increasing our global footprint, the team has made us and our multitenant architecture today's cloud security leader, with more than 15 million users in 185 countries. Bring your vision and passion to our team of cloud architects, software engineers, security experts, and more who are enabling organizations worldwide to harness speed and agility with a cloud-first strategy.

NOTE: U.S. citizenship is required for this position due to the nature of the customers assigned to this role.

We're looking for an experienced Staff Site Reliability Engineer - Incident Response to join our Shared Platform Engineer team. Reporting to the Director Cloud Operations and Incident Management, you will:

  • Lead and advocate for the transformation to a world-leading SRE organization, promoting SRE principles within the Engineering Department.
  • Provide expert leadership during critical outages, coordinating multiple teams to ensure streamlined decision-making and quick resolution.
  • Promote a customer-focused approach by addressing and mitigating global customer environment issues, and fostering a culture of continuous learning and technical excellence within the SRE team.
  • Develop and implement scalable process frameworks and observability strategies to ensure rapid problem diagnosis, response, and service reliability.
  • Collaborate with product teams to thoroughly analyze failures and integrate insights to improve service reliability, scalability, and operational efficiency.

What …
