Staff Site Reliability Engineer, Infrastructure, Observability

US Remote

We're Cruise, a self-driving service designed for the cities we love.

We’re building the world’s most advanced self-driving vehicles to safely connect people to the places, things, and experiences they care about. We believe self-driving vehicles will help save lives, reshape cities, give back time in transit, and restore freedom of movement for many.

In our cars, you’re free to be yourself. It’s the same here at Cruise. We’re creating a culture that values the experiences and contributions of all of the unique individuals who collectively make up Cruise, so that every employee can do their best work. 

Cruise is committed to building a diverse, equitable, and inclusive environment, both in our workplace and in our products. If you are looking to play a part in making a positive impact in the world by advancing the revolutionary work of self-driving cars, come join us. Even if you might not meet every requirement, we strongly encourage you to apply. You might just be the right candidate for us.

The Observability team at Cruise is looking for a Staff Site Reliability Engineer to play a critical role in building out and improving observability systems, tools and the related codebase.

Site Reliability Engineers at Cruise bring specialized knowledge and experience to ensure the reliability, scalability, performance, efficiency, and security of our systems.

 

What you'll be doing:

  • Using your software and systems engineering skills to contribute code, perform code reviews, and create technical designs that improve performance and reliability of observability systems.
  • Proactively identifying and addressing challenges that create new opportunities to improve the state of engineering through observability.
  • Partnering with Software Engineering teams to better understand use-cases and guide the engineers to use the existing tools effectively.
  • Building tools to enable engineers to collect and act on observability signals.

 

What you must have:

  • Previous experience as an SRE, Production Engineer, Systems Engineer, or Software Engineer with a focus on distributed systems reliability.
  • Considerable experience working with container orchestration systems (e.g., Kubernetes).
  • Proficiency in designing and developing sophisticated distributed systems, with expertise in one or more high-level programming languages such as Go, Python, Rust, C/C++, or Node.js.
  • Experience implementing a new technology or service by leading or driving a cross-functional effort.
  • Experience designing and implementing large-scale systems.
  • Considerable …