FreshRemote.Work

Director of Site Reliability Engineering

Remote, Washington, USA

Veeam®, the #1 global market leader in data protection and ransomware recovery, is on a mission to empower every organization to not just bounce back from a data outage or loss but bounce forward.

With Veeam, organizations achieve radical resilience through data security, data recovery, and data freedom for their hybrid cloud. 

The Veeam Data Platform delivers a single solution for cloud, virtual, physical, SaaS, and Kubernetes environments that gives IT and security leaders peace of mind that their apps and data are protected and always available.

Headquartered in Seattle with offices in more than 30 countries, Veeam protects over 450,000 customers worldwide, including 74% of the Global 2000, who trust Veeam to keep their businesses running.

 

As Director of Site Reliability Engineering at Veeam, you will lead a global team of SREs working on the Veeam Data Cloud, the world’s most successful modern data protection platform. You must be an excellent technical and people leader who will own and drive engineering excellence. You will be responsible for team and individual development, as well as for mentoring the next level of engineering leadership within the company. Reporting to the VP of Engineering, you will have a high level of autonomy and be held accountable for delivering business results.

Your tasks will include: 

  • Define and drive SRE strategy: Establish and implement a vision for reliability, availability, and operational excellence across all Veeam Data Cloud (VDC) systems.
  • Lead incident and change management: Manage and refine processes to strengthen incident response, root cause analysis, and change control, ensuring every change is tracked and measured.
  • Drive organization-wide operational excellence: Act as a thought leader and change agent, championing proactive failure analysis, chaos engineering, and incident reviews to continuously improve system reliability.
  • Enable engineering teams: Collaborate with engineering teams and develop processes and tooling that empower those teams to effectively operate their applications.
  • Support on-call culture: Define best practices for on-call rotations, incident response, and escalation policies. The SRE team will help set the standard for operational excellence, fill gaps in on-call coverage, and act as first responders when necessary to ensure critical issues are addressed swiftly.
  • Build and lead a high-performing team: Hire, mentor, and manage a global SRE team focused on automation, operational maturity, and platform reliability.
  • Develop and track reliability metrics: Define and monitor SLOs, SLIs, and error budgets to align reliability efforts with business needs.

What we expect from you: 

  • 5+ years of experience leading SRE teams operating high-scale, cloud-native SaaS products.
  • 7+ years of hands-on SRE experience in fast-paced, high-growth software companies.
  • Proven experience building and scaling on-call rotations, improving incident management processes, and establishing operational best practices.
  • Deep expertise in public cloud infrastructure, ideally Azure.
  • Strong understanding of Kubernetes, Infrastructure as Code (IaC), and modern observability practices (e.g., distributed tracing, metrics, and logging).
  • Experience implementing secure development practices, CI/CD pipelines, and operational processes in compliance-focused environments.
  • Demonstrated success managing cross-functional teams and collaborating with engineering, support, security, and other stakeholders.
  • Experience presenting to executives in high-pressure situations.
  • Experience managing vendor relationships and external partnerships.
  • Bachelor’s degree in Computer Science, Information Security, or a related field (Master’s degree preferred).

We offer: 

  • Unlimited PTO
  • Medical, dental, and vision benefits that start on day one
  • Flexible spending accounts
  • Life insurance and short-term and long-term disability coverage
  • Family planning support benefits, along with 100% paid maternity and parental leave
  • 401k match
  • Veeam Care Days – additional 24 hours for your volunteering activities
  • Professional training and education, including courses and workshops, internal meetups, unlimited access to our online learning platforms (Percipio, Athena, O’Reilly), and mentoring through our MentorLab program

The salary range posted is On Target Earnings (OTE), which is inclusive of base and variable pay. When making an offer of employment, Veeam will take into consideration the candidate’s expectations, experience, education, scope of responsibility for the role, and the current market demands.

United States of America Pay Range: $239,600 to $342,300 USD

Please note: If the applicant is permanently located outside the United States of America, Veeam reserves the right to decline the application.

Veeam Software is an equal opportunity employer and does not tolerate discrimination in any form on the basis of race, color, religion, gender, age, national origin, citizenship, disability, veteran status or any other classification protected by federal, state or local law. All your information will be kept confidential.

Please note that any personal data collected from you during the recruitment process will be processed in accordance with our Recruiting Privacy Notice.  

The Privacy Notice sets out the basis on which the personal data collected from you, or that you provide to us, will be processed by us in connection with our recruitment processes. 

By applying for this position, you consent to the processing of your personal data in accordance with our Recruiting Privacy Notice.
