Site Reliability Engineer
Mountain View, California 94041, United States, or Remote
Overview
Love staying ahead of the growth curve and experimenting with new software and environments? Get on board as an Atlassian Site Reliability Engineer.
Responsibilities
As a Site Reliability Engineer (SRE), you will actively work to improve the performance and reliability of services, address the root causes of incidents, and reduce incident rates.
You will dive deep into the services we support, own both the problem and its solution, and automate away repetitive work.
You'll also respond to pings, pages, and alerts to investigate issues in our systems that you can really sink your teeth into.
The best person for this role is someone with a collaborative spirit. In our world, it’s not about being a hero and having all the answers; it’s about sometimes saying "I don't know" and working to find solutions rather than starting from an assumption.
The team needs someone who can ask questions, learn from others, and turn chaos into order. You will serve in a weekly on-call rotation to make sure our products meet established SLAs.
This role would be a great fit for someone with creative and innovative problem-solving skills and a willingness to take responsibility for the code they write all the way to production. You will develop and implement solutions that operate at scale and see your own technology efforts directly improve the reliability of our services. Our teams are empowered and expected to improve our products to truly deliver a reliable experience to customers. To realise this goal, you will own development efforts in every sprint from planning to delivery and collaborate with different team members to review code.
One thing we promise: you’ll never be bored.
Qualifications
On your first day, you will have experience in:
Writing code in Bash and Python
Triaging and diagnosing user-facing service outages
Capacity planning, demand forecasting, software performance analysis, and systems tuning
Configuring and managing enterprise monitoring solutions
Working with Linux systems
Building, automating, and maintaining infrastructure in Amazon Web Services
Maintaining a high standard of code quality
We'd be super excited if you have:
Exposure to, and experience maintaining, configuration management and orchestration tools such as Ansible and Puppet
Experience with microservices architectures and container management tools such as Docker and Kubernetes
Understanding of ITIL terminology for incident and problem management
Experience managing and troubleshooting a continuous integration pipeline
…
Job Profile
Community engagement support, Competitive compensation, Health coverage, Paid volunteer days, Variety of perks, Wellness resources
Skills: Ansible, AWS, Bash, CI/CD, Collaboration, Docker, FedRAMP, ITIL, Kubernetes, Linux, Puppet, Python, SOC2
Experience: 3 years
Timezones: America/Anchorage, America/Chicago, America/Denver, America/Los_Angeles, America/New_York, Pacific/Honolulu, UTC-10, UTC-5, UTC-6, UTC-7, UTC-8, UTC-9