
Staff SRE Engineer

North America - Remote

About Invisible

Invisible Technologies is the AI training and scaling partner for the leading foundation model providers, enterprises, and governments, bridging the gap between AI potential and production. Invisible’s unique AI Process Platform combines elite global human expertise, cutting-edge technology, and deep institutional knowledge gained by training 80% of the world’s leading AI models. Trusted by AWS, Microsoft, and Cohere, we have an unparalleled ability to operationalize AI for real-world applications. Our explosive growth landed us the #3 spot on the Inc. 5000 in 2024, and we closed the year with $134M in revenue.

About The Role

We are always striving to build the right thing. You are a key partner for the Engineering and Product teams. You will focus your energy on driving reliability and automation for our products. The ideal candidate has learned from experience that technical decisions have far-reaching consequences. As an experienced professional engineer, you are always mindful to avoid technical debt and waste.

 

What You’ll Do

  • Ensure the availability, performance, and scalability of production systems
  • Deploy, configure, automate, and manage cloud-based infrastructure using tools like Kubernetes, Terraform, and Argo
  • Identify and resolve system bottlenecks, optimizing for performance and cost efficiency across engineering teams
  • Design, support, and manage deployment pipelines to enable world-class delivery of applications
  • Design, develop, and maintain comprehensive monitoring and observability systems using Datadog and Sentry
  • Define Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure reliability and performance
  • Design and implement automated solutions to reduce manual operational tasks
  • Build tools for system provisioning, monitoring, deployment, and scaling 
  • Collaborate closely within engineering teams to improve application reliability, resilience, and maturity 

What We Need

  • Strong understanding of cloud architecture including expertise with major cloud providers (GCP, AWS, Azure)
  • Proficiency in a programming language and ability to write production code beyond just scripting
  • Understanding of the underlying networking and security considerations when developing the architecture of our deployment environments
  • Strong understanding of relational databases (PostgreSQL), and comfort optimizing them and advising the broader engineering team on optimization techniques so that the data layer of our deployed services runs smoothly
  • Strong understanding of authentication and authorization principles such as IAM, Security Groups, RBAC, etc.
  • Understanding of software engineering fundamentals, practices, and patterns with distributed cloud services
  • Strong experience with production systems troubleshooting …

Job Profile

Regions

North America

Restrictions

  • Location-based eligibility requirements
  • Remote

Benefits/Perks

  • Annual profit reinvestment
  • Co-ownership
  • Exceptional benefits
  • Liquidity for shares
  • Partner ownership
  • Remote-first company
  • Remote work
  • Transparent pay model

Tasks
  • Automate operational tasks
  • Collaborate with engineering teams
  • Define SLOs and SLIs
  • Deploy and manage cloud infrastructure
  • Design deployment pipelines
  • Ensure system availability
  • Maintain monitoring systems
  • Optimize system performance
Skills

AI, AI models, AI Training, Argo, Authentication, Automation, AWS, Azure, Cloud Architecture, CloudFormation, Datadog, Design, Distributed cloud services, Engineering, GCP, IAM, Infrastructure, Infrastructure as Code, Kubernetes, Monitoring, Networking, Observability, Optimization, PostgreSQL, Programming, RBAC, Recruiting, Recruitment, Relational databases, Reliability, Security, Sentry, Software Engineering, SRE, Talent Acquisition, Technology, Terraform, Training

Experience

5 years

Education

Design, Engineering