Director, Site Reliability Engineering
US - Distributed
Menlo Security's mission is enabling the world to connect, communicate, and collaborate securely without compromise. COVID-19 has made our mission all the more real. Our customers include Fortune 500 companies, nine of the ten largest global banks, and the Department of Defense.
The world has fundamentally changed. We are growing from 400 employees into the next phase of our journey, and we need passionate talent filled with empathy and agility. The right candidate for the job is ethical, hyper-organized, fanatical about seeing things through to completion, service-oriented, and humble enough to take feedback and coaching yet confident enough to provide feedback and coaching.
Menlo is well-funded for growth and our investors are second to none. They include Vista Equity Partners (“Vista”), General Catalyst, JPMC, American Express, HSBC, and Ericsson Ventures.
About the Role
The Tech Ops/SRE team ensures the reliability, scalability, and performance of our critical systems and infrastructure. We expect failure, build security in by design, create evolvable systems, and enable multi-tenancy across the infrastructure. Automation is an absolute for us.
We are committed to getting it done properly, the first time.
As the Director of Technical Operations, you'll manage a team of globally distributed engineers and engineering leaders across EMEA, Canada, and the US. This team is responsible for managing the company's core infrastructure services and maintaining our constantly growing platform. The team deploys and operates the Menlo Security product and infrastructure. Success in this role requires strong technical leadership skills; a broad background and understanding of systems, networks, and applications; and excellent people leadership. You will focus on streamlining processes, increasing the team's autonomy, and driving critical cross-functional projects with a variety of stakeholders. The ideal candidate will have a proven track record of building and leading high-performing SRE teams, driving a culture of continuous improvement, and implementing effective processes and tools to achieve operational excellence.
Responsibilities
Leadership and Strategy:
Provide strategic direction and vision for the SRE team, aligning with overall company goals and objectives.
Build and mentor a high-performing SRE team, fostering a collaborative and innovative culture.
Develop and implement SRE best practices, processes, and tools to improve system reliability and performance.
Define and track key performance indicators (KPIs) to measure and report on SRE effectiveness.
Partner with other engineering teams to ensure seamless integration and collaboration.
System Reliability and Performance:
Oversee the design, implementation, and maintenance of highly reliable …
Job Profile
- Collaborate with development teams
- Conduct root cause analysis
- Implement best practices
- Provide strategic direction
Application Performance, Automation, AWS, Azure, Capacity planning, Collaboration, Communication, GCP, Incident Response, Leadership, Network management, Performance Optimization, Problem-solving, Process Improvement, Root Cause Analysis, Site Reliability Engineering, Systems Design, Technical Leadership
Experience: 5 years
Education
Timezones: America/Anchorage, America/Chicago, America/Denver, America/Los_Angeles, America/New_York, Pacific/Honolulu, UTC-10, UTC-5, UTC-6, UTC-7, UTC-8, UTC-9