Site Reliability Engineer II
San Francisco, CA (Remote)
Our mission is to make higher education accessible and affordable for everyone. We empower students with financial support and supercharge their ability to pay down their debt, so they can get on the right financial track, fast.
We build tools that help people feel in control of their financial future, including:
- Private student loans - low rates, people-first service, and flexible payments.
- Student loan refinancing - break free from high interest rates or high monthly payments.
- Scholarships - access to thousands of scholarships to help students pay less.
Earnies are committed to helping students live their best lives, free from the stress of student debt. If you’re as passionate as we are about our mission, read more below, and let’s build something great together!
The Site Reliability Engineer II position will report to the Lead Cloud Engineer.
As an SRE II, you will:
- Set up and maintain comprehensive monitoring, create and refine playbooks, build dashboards, and adopt industry-standard practices to enhance the reliability and resilience of our site and systems.
- Develop and manage IaC to ensure reliable, scalable, and high-performance systems, reducing configuration drift and enabling rapid recovery.
- Implement and maintain both in-house and SaaS-based tools to measure SLIs and track SLOs and SLAs, ensuring we meet our reliability targets and provide transparency into system health.
- Identify opportunities for automation across the infrastructure to minimize manual interventions, streamline operations, and improve response times.
- Participate in on-call rotations, respond to incidents, conduct root cause analyses, and contribute to post-incident reviews to drive improvements.
- Work closely with cross-functional teams to enhance system design, support code deployments, and optimize system performance.
About You:
- 3+ years of professional experience in Site Reliability Engineering or a similar role, with a focus on infrastructure, automation, and system reliability.
- Hands-on experience with cloud providers (AWS), containerization (Kubernetes, Docker), CI/CD pipelines, and observability tools (e.g., Prometheus, Grafana or New Relic/Splunk).
- Willing to travel to the Oakland office monthly to engage with team members and strengthen collaboration.
- You enjoy learning new technologies, stay adaptable in a dynamic environment, and thrive in a team-oriented setting where shared goals are prioritized.
Even Better:
- Passionate about seeking opportunities to innovate and implement changes that enhance system reliability and client satisfaction.
- Champions self-service infrastructure solutions to empower development teams and accelerate deployment cycles.
- Embodies continuous improvement and is committed to driving projects beyond "good enough" toward operational excellence.
- Proactively identifies potential issues and implements preventive measures to ensure consistent system uptime.
- Able to clearly document processes and communicate with technical and non-technical stakeholders to ensure alignment.
Where:
- This role will be based in the San Francisco Bay Area.
- While you’ll enjoy the flexibility of remote work, we also love to see our Earnies face-to-face! We ask you to join us at our Oakland office for 3 consecutive days a month for team collaboration and some fun. It's a chance to connect, share ideas, and maybe even grab some coffee together!
A little about our pay philosophy: We take pride in compensating our employees fairly and equitably. We are showcasing a range of your potential base salary. The successful candidate’s starting pay will also be determined based on job-related qualifications, internal compensation, and budget. This range may be modified in the future.
Pay Range: $155,000-$175,000 USD
Earnest believes in enabling our employees to live their best lives. We offer a variety of perks and competitive benefits, including:
- Health, Dental, & Vision benefits plus savings plans
- Mac computers + work-from-home stipend to set up your home office
- Monthly internet and phone reimbursement
- Employee Stock Purchase Plan
- Restricted Stock Units (RSUs)
- 401(k) plan to help you save for retirement plus a company match
- Robust tuition reimbursement program
- $1,000 travel perk on each Earnie-versary to anywhere in the world
- Competitive annual PTO
- Competitive parental leave
What makes an “Earnie” culture:
- Drivers – Drivers are satisfied by making things happen, not coming along for the ride. They feel a strong sense of ownership for their projects and teams and demand high standards from themselves and others.
- Humility – Humble team players check their egos and consider the team’s needs above their own. They are self-aware of their strengths and opportunities for improvement.
- Growth Mindset – People with a growth mindset approach challenges and failures as learning opportunities. They seek feedback to improve, give feedback to others, and genuinely want to perform well.
At Earnest, we are committed to building an environment where our employees feel included, valued, and heard. Our belief is that a strong commitment to diversity, inclusion, equity, and belonging enables us to move forward with our mission. We are dedicated to adding new perspectives to the team and encourage you to apply if your experience is close to what we are looking for.
Earnest provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, sexual orientation, gender identity, veteran status, disability or genetics. Qualified applicants with criminal histories will be considered for the position in a manner consistent with the Fair Chance Ordinance.