Staff Site Reliability Engineer
Remote
About Ancestry:
When you join Ancestry, you join a human-centered company where every person’s story is important. Ancestry®, the global leader in family history, empowers journeys of personal discovery to enrich lives. With our unparalleled collection of more than 40 billion records, over 3 million subscribers and over 23 million people in our growing DNA network, customers can discover their family story and gain a new level of understanding about their lives. Over the past 40 years, we’ve built trusted relationships with millions of people who have chosen us as the platform for discovering, preserving and sharing the most important information about themselves and their families.
We are committed to our location-flexible work approach, allowing you to choose to work in the nearest office, from your home, or a hybrid of both (subject to location restrictions and roles that are required to be in the office; see the full list of eligible US locations HERE). We will continue to hire and promote beyond the boundaries of our office locations to broaden possibilities for employee diversity.
Together, we work every day to foster a work environment that's inclusive as well as diverse, and where our people can be themselves. Every idea and perspective is valued so that our products and services reflect the global and diverse clients we serve.
Ancestry encourages applications from minorities, women, the disabled, protected veterans and all other qualified applicants. Passionate about dedicating your work to enriching people’s lives? Join the curious.
As a Staff Site Reliability Engineer (SRE) at Ancestry, you will play a critical role in enhancing the reliability, performance, and scalability of our services. Reporting to our Principal Software Engineering Manager, you will collaborate closely with our engineering teams to design, build, and instrument our web applications and systems infrastructure, with a strong focus on automation, availability, and performance. A deep understanding of system administration is essential, and specific experience with both Linux and Windows environments is required.
What you will do…
Own site reliability for a product vertical in collaboration with engineering
Define SLOs, SLIs, and error budgets, and ensure they remain in compliance with standards
Develop improved monitoring, auto-scaling, and resiliency patterns and capabilities
Debug complex issues across multiple services in AWS, including outward-facing infrastructure
Collaborate on and develop cloud automation and new best practices in support of the vertical and the organization
Train, mentor, and support …
Job Profile
Benefits/Perks: Diversity initiatives, Inclusive work environment, Location flexibility
Tasks: Collaborate with engineering
Skills: AWS, Bash, CI/CD, Cloud Automation, CloudFormation, Cloud networking, Database technologies, Fault tolerance, Go, Java, Linux, Monitoring, New Relic, Node.js, Prometheus, Python, Resilience, Site Reliability Engineering, Software Engineering, Terraform, Windows
Experience: 7 years