Sr. Site Reliability Engineer - Remote USA
Aliso Viejo, California, United States
- Design and implement a comprehensive observability strategy using Datadog to provide a single pane of glass across development, operations, infrastructure, data, and database workloads
- Develop and maintain sophisticated alerting frameworks that minimize alert fatigue while ensuring critical issues are detected early
- Create and optimize SLIs, SLOs, and error budgets across services
- Implement automated remediation workflows for common failure scenarios
- Work with development teams to implement proper instrumentation, logging, and monitoring best practices
- Lead incident response and postmortem analyses, and implement systematic improvements
- Design and maintain dashboards that provide actionable insights for different stakeholder groups
- Reduce toil through automation, using infrastructure as code and monitoring as code practices
- Other duties as assigned
- 5+ years of hands-on SRE experience in large-scale production environments
- Deep expertise with Datadog, including APM, Infrastructure Monitoring, Log Management, and Synthetic Monitoring
- Strong experience writing and optimizing monitoring as code using Terraform or similar tools
- Proficiency in at least one programming language (Python, Go, or Java preferred)
- Experience with modern observability practices including distributed tracing, metric aggregation, and log correlation
- Strong understanding of reliability engineering principles including SLIs, SLOs, error budgets, and toil reduction
- Experience with cloud platforms (AWS, Azure, or GCP) and containerized environments
- Knowledge of database systems and their monitoring requirements
- Understanding of network protocols and ability to troubleshoot network-related issues
- Master's degree
- Experience with chaos engineering practices
- Knowledge of machine learning for anomaly detection
- Experience with high-throughput, low-latency systems
Ambry Genetics Corporation is a CAP-accredited and CLIA-licensed molecular genetics laboratory based in Aliso Viejo, California. We are a genetics-based healthcare company that is dedicated to open scientific exchange so we can work together to understand and treat all human disease faster.
At Ambry, everyone is welcome. A career at Ambry Genetics is a chance to be part of a dynamic company that aims to improve health by understanding the relationships between genetics and human disease. We earned our reputation as industry leaders by responsibly introducing cutting-edge genetic testing solutions and continually sharing what we learn with the global scientific community.
At Ambry you will be learning, challenging yourself, and having fun while collaborating with teammates through the open exchange of ideas. Our outstanding benefits program includes medical, dental, vision, 401k with a 4% employer match, FSA, paid sick leave, and a generous paid time off (PTO) program. You can learn more about the benefits here. Ambry Genetics is an Equal Opportunity Employer (EOE) and we maintain a drug-free work environment.
The Company believes in second chance employment. Qualified applicants with arrest or conviction history will be considered regardless of their arrest or conviction history, consistent with local laws such as the Los Angeles County Fair Chance Ordinance and the California Fair Chance Act. You do not need to disclose your criminal history or participate in a background check until a conditional job offer is made to you. After making a conditional offer and running a background check, if the Company is concerned about a conviction that is directly related to the job, you will be given the chance to explain the circumstances surrounding the conviction, provide mitigating evidence, or challenge the accuracy of the background report. For the purpose of the above job description, “Essential Functions” are “Material Job Duties”.
Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.
All qualified applicants will receive consideration for employment without regard to race (and traits historically associated with race, including, but not limited to, hair texture and protective hairstyles such as braids, locks, and twists), color, creed, religion, sex, sexual orientation, gender identity, gender expression (including transgender status), national origin, ancestry, age, marital status, or protected veteran status and will not be discriminated against on the basis of disability, protected medical condition as defined by applicable state or local law, genetic information, or any other characteristic protected by applicable federal, state, or local laws and ordinances. If you have a disability or special need that requires accommodation, please contact us at careers@ambrygen.com.
Ambry does not accept unsolicited resumes from individual recruiters, third party recruiting agencies, outside recruiters or firms without an executed contract in place. We are not responsible for any fees related to resumes that are unsolicited or are received by Ambry. Such resumes will be deemed the sole property of Ambry and will be processed accordingly.
PRIVACY NOTICES
To review Ambry’s Privacy Notice, click here: https://www.ambrygen.com/legal/privacy-policy
To review the California privacy notice, click here: California Privacy Notice | Ambry Genetics
To review the UKG privacy notice, click here: California Privacy Notice | UKG