Site Reliability Engineer
Remote
Fulfil is a well-funded, rapidly growing, and inclusive company that has developed a custom automation robotics system to pick and pack online orders of groceries and other consumables, bringing delight and a value proposition to consumers that doesn’t exist today. Additionally, its unique design and technology are purpose-built to solve today’s environmental problems in the world’s food supply chain. Founded by a team with previous startup success and backed by top-tier VCs, we are committed to reducing waste, improving environmental impact, and reducing emissions with truly new technology. Our first commercial product launch with the technology is scheduled for summer 2022.
We can’t do it alone -- we’re seeking curious, capable, passionate team members motivated by the opportunity to create lasting impact on the world through their work. This role offers ample growth opportunities while working side-by-side with an impassioned, multi-disciplinary team spanning mechanical design, software, computer vision, systems integration, and ops to design and operationalize world-changing technology.
Fulfil is committed to creating an inclusive culture, and we celebrate diversity of all kinds. If this sounds like the kind of environment that you find intriguing, then please apply even if you don’t feel you meet all the requirements listed below. We'd love to hear from you.
Why you’ll love working at Fulfil
- Autonomy and ownership; you design it, build it, and own it.
- Modern technologies
- Work closely with robots on a practical and impactful product
- No spaghetti code here; best-in-class testing, logging, monitoring, and deployment infrastructure
- Rapid growth
- Inclusive culture
- State-of-the-art simulation infrastructure
- Work-life balance
- Collaborative culture
- We hire great people and trust them
- Natively remote team that will be remote forever.
What You’ll Do:
- Be on a PagerDuty rotation to respond to availability incidents and provide support for developers and the business.
- Build, manage, and maintain our cloud infrastructure with Kubernetes, Flux, and other tools.
- Build and maintain automated configuration management.
- Help plan the growth trajectory of Fulfil's infrastructure.
- Help ensure we're following industry best practices.
- Actively participate in incident response in the wake of production issues.
- Build and assist with CI/CD deployments and application observability.
Our stack
- Cloud: GCP
- Orchestration: Kubernetes
- GitOps: Flux
- Monitoring: Grafana stack
- CI/CD: GitHub Actions
- Containers: Docker / containerd
Minimum requirements
- BS degree in CS, Software Engineering, or a related field, or equivalent experience.
- You won't be successful unless you are extremely comfortable with Kubernetes.
- You need to be comfortable with Prometheus, Grafana, Loki, and Alertmanager.
- Not afraid of robots
- At least 5 years of experience maintaining complex high-volume applications in a production environment
- Solid coding in Python, Node, Ruby, Go, or other high-level languages.
- Solid understanding of InfoSec best practices
- Experience with AWS || GCP || Azure
Bonus Qualifications
- Network administration
- Network troubleshooting
- MetalLB
- Metal3
- Experience managing on-prem and cloud clusters
Seniority and Compensation
Seniority: Senior to Staff level.
Compensation: the salary range for this position is $145,000-$185,000 plus stock and benefits, depending on experience and location. Pay within the range is based on candidate experience, job-specific skills, education, and work location.
Look... we seek out great engineers with a diverse set of skills, from different backgrounds, who learn quickly and are open to solving a wide variety of problems. If this sounds like an environment you would thrive in, we encourage you to apply.