Site Reliability Engineer

Remote - Vancouver

Cover Genius

USD 134K+, Full Time, Senior/Mid

Published 2 weeks ago

The Company
Cover Genius is a Series E insurtech that protects the global customers of the world’s largest digital companies, including Booking Holdings (owner of Priceline, Kayak and Booking.com), Intuit, Uber, Hopper, Ryanair, Turkish Airlines, Descartes ShipRush, Zip and SeatGeek. We’re also available at Amazon, Flipkart, eBay, Wayfair and SE Asia’s largest company, Shopee. Our partners integrate with XCover, our award-winning insurance distribution platform, to embed protection for millions of customers worldwide each year. Our team and products have been recognized with dozens of awards, including by the Financial Times, which ranked Cover Genius as the #1 fastest-growing company in APAC in 2020. Our diverse team across 20+ countries and many language groups commits itself to diverse cultural programs, in particular “CG Gives”, which makes social entrepreneurs out of us all and funds development initiatives in global communities.
Our People are Bold, Authentic, Purposeful and Inspired
Our People are not Perfect, Traditional, Complacent or Cautious
About the Role
As a Site Reliability Engineer on our Technology Team, you will own the reliable operation and continuous improvement of our production systems. Your primary purpose will be to ensure the seamless and secure functioning of our platforms and operations.
To drive success in this role, you will have a strong background in systems engineering and automation, with experience in release processes, observability, security, core network and infrastructure, and datastores and disaster recovery. You should possess excellent problem-solving skills, a keen attention to detail, and a proactive approach to identifying and mitigating potential issues.
As the Site Reliability Engineer, you will be responsible for:
- Monitoring system health and ensuring operational stability and security
- Automating and optimizing platform operations
- Sharing ownership of production workloads with software engineering teams
- Writing and maintaining technical documentation, including tutorials, guides, and blameless post-mortems
- Designing and creating information dashboards based on logging and monitoring data
- Collaborating with software engineers to drive automation, scalability, and efficiency across technology products and platforms
Regular collaboration with software engineering teams, security teams, and other relevant stakeholders will be key to ensuring the reliability and efficiency of our production systems.

What will your day look like? You will...

Analyze, test and modify systems to improve reliability and optimize performance, particularly at an architectural/infrastructure level
Develop and maintain observability tooling and dashboards
Implement automation tools, frameworks and CI/CD pipelines to reduce toil
Troubleshoot production issues and coordinate with the development team to streamline code deployments
Apply AWS and GCP knowledge and skills to create & maintain cloud infrastructure for software projects
Design, develop and implement software integrations
Collaborate with Software Engineers and other team members with the goal of improving engineering tools, systems, procedures and data security
Develop and maintain design and troubleshooting documentation and runbooks
Optimize and control costs of the company’s computing infrastructure

To help us level up, you'll ideally have:

Understanding of SRE Principles and best practices
Experience using & configuring modern observability tools such as ELK/EFK, Prometheus, Grafana
Comfortable scripting & developing internal tooling with Bash and at least one programming language (e.g. Python, Go)
Experience working with infrastructure & configuration as code tools such as Terraform, CloudFormation, Chef, Puppet etc.
Experience with container technology such as Docker, and ideally with using and managing Kubernetes clusters
Experience working with Linux
Solid understanding of networking and system architecture
Solid understanding of how to deploy, scale and monitor web applications and databases
Good knowledge of AWS and/or GCP platforms and associated best practices
Bachelor's degree in Computer Science/Engineering; a postgraduate degree and/or record of academic achievement is also desirable

To be successful, you'll bring:

Strong communication and documentation skills
Curious and self-motivated learner
Professional approach
Good team member
Organizational and time management skills
Excellent attention to detail
Positive approach to change

Why Cover Genius?
Cover Genius not only cares about being the best in our industry, we also care about our team. We’re a business that understands life can be fluid, so we flex to provide an environment that suits that. What does that mean?
• Flexible PTO. Taking time out is important for our teams to enjoy life and stay fresh.
• Employee Stock Options - we want our people to share in our success, so we reward them with ownership for their contribution in creating a world-class company.
• Work with like-minded people who are passionate about both the work we're doing and giving back. Our CG Gives program enables us all to become philanthropists through our peer recognition and rewards system.
• Social Initiatives - pictures speak a thousand words!
Salary Range
The base salary range for this role is between $134,000 and $176,000. The total compensation package also includes equity, and the opportunity for additional earnings through our annual bonus or variable commission plans.
We believe in transparency, and this salary range bracket is designed to provide a clear understanding of the potential earnings associated with this role. Your skills and contributions are highly valued, and we look forward to welcoming you to our team.
Sound interesting? If you think you have the best composition of the above, send us your resume and let's chat!
* Cover Genius promotes diversity and inclusivity. We don't tolerate discrimination, demeaning treatment of anyone, or harassment due to race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or any other legally protected status.

Job Profile

Regions

North America

Countries

Canada

Tasks

Automate platform operations
Collaborate for automation and scalability
Design information dashboards
Monitor system health
Share workload with software engineering teams
Write technical documentation

Skills

Automation, Bash scripting, Datastores, Disaster Recovery, Network Infrastructure, Networking, Observability, Release Processes, Security, Systems Engineering, Terraform

Timezones

America/Edmonton, America/Moncton, America/Regina, America/St_Johns, America/Toronto, America/Vancouver, UTC-3, UTC-4, UTC-5, UTC-6, UTC-7, UTC-8