Intermediate Site Reliability Engineer, Gitaly:Cluster

Remote

GitLab is an open core software company that develops the most comprehensive AI-powered DevSecOps Platform, used by more than 100,000 organizations. Our mission is to enable everyone to contribute to and co-create the software that powers our world. When everyone can contribute, consumers become contributors, significantly accelerating the rate of human progress. This mission is integral to our culture, influencing how we hire, build products, and lead our industry. We make this possible at GitLab by running our operations on our product and staying aligned with our values. Learn more about Life at GitLab.

An overview of this role

The GitLab DevSecOps platform empowers 100,000+ organizations to deliver software faster and more efficiently. We are one of the world’s largest all-remote companies with 2,000+ team members and values that foster a culture where people embrace the belief that everyone can contribute.

SREs with Gitaly work alongside Backend Engineers, focusing primarily on improving the availability and reliability of the Gitaly fleet on GitLab.com. While the Backend Engineers approach their responsibilities from a software developer's point of view, the SREs approach the same problems from an operational perspective. The two groups collaborate closely on finding an optimal solution and on ensuring that new Gitaly features can run at scale and be deployed to production safely.

Gitaly is the Git data storage tier of GitLab, providing a reliable, secure and fast distributed Git data store over gRPC. For more information about Gitaly, see the team’s Direction page. 

Gitaly’s high-availability storage requires developers who understand distributed storage systems, their management, observability and availability. The Cluster team contributes features, fixes bugs and improves the performance of this software stack.

Currently, we're building a new distributed cluster solution and improving our Disaster Recovery readiness.

What you’ll do  

  • Work with peer SREs to maintain Gitaly’s environments within GitLab’s SaaS offerings, including cost and performance optimization, capacity planning, migrations and debugging production issues.
  • Participate in architectural discussions and decisions about Gitaly within the greater GitLab ecosystem.
  • Design RPC interfaces for the Gitaly service.
  • Scope, estimate and describe tasks to reach the team’s goals.
  • Develop production automation and tooling for Gitaly, for use both in SaaS and self-managed installations.
  • Help ensure that Gitaly development tooling, releases and other processes serve the team and the product’s goals.
  • …