Site Reliability Automation and Orchestration Engineer
6314 Remote/Teleworker US, United States
More About the Role:
The U.S. Navy’s Service Management, Integration, and Transport (SMIT) program has an opening for a Site Reliability Automation and Orchestration Engineer on a high-visibility DoD program that provides engineering support to the Navy Marine Corps Intranet (NMCI), the largest information technology (IT) network in the world. This position will provide many opportunities to challenge and grow your skills.
This person should be a seasoned, self-motivated professional with hands-on engineering and testing experience in virtualized or on-prem environments. The ideal candidate is a skilled technician, knowledgeable and experienced in infrastructure-as-code environments, automating patches and system configurations, orchestrating network operations and DevOps pipelines, scaling releases, and day-to-day system operations in commercial cloud and on-prem environments.
What You'll Get to Do:
Provide continuous development and continuous integration support, including site reliability engineering and integration, release management, implementation and migration, and training and knowledge transfer.
SRE Engineering and Integration:
•Automate routine tasks and software updates; document and maintain functional, integration, security, and load/stress testing procedures.
•Document and deliver an approach for automated acceptance testing and integration/regression testing of all new applications and capabilities.
•Design Ansible playbooks, develop roles, and test them in non-production environments before migrating them to production networks.
Release Management:
•Participate in the development of the release management process, procedures, and policies.
•Establish, manage, update, and maintain the overall Release Management Plan and Release Schedule.
•Conduct site surveys, as necessary, to assess the equipment and software in use and to validate release package requirements and dependencies.
Training & Knowledge Transfer:
•Develop, document, and maintain work instruction materials for related Automation and Orchestration processes.
•Provide training when substantive technological changes are introduced.
You'll Bring These Qualifications:
•B.S. degree and 8-12 years of hands-on site reliability engineering and automation experience, ideally with the federal government; additional experience may be considered in lieu of a degree.
•IAT Level II Baseline Certification (e.g. CCNA Security, CySA+, GICSP, GSEC, Security+ CE, CND, SSCP).
•Must be a US Citizen and possess an active Secret Clearance.
•Experience working in DevOps, Continuous Delivery (CI/CD), and Agile environments using code delivery mechanisms, continuous build systems, code repositories, and continuous delivery solutions.
•Must be able to support program execution in classified environments and access SIPRNet from an NMCI location on short notice (local travel).
•Experience designing, coding, debugging, and maintaining automation scripts (using Bash, Python, etc.) preferred.
•Experience in CI/CD toolsets (e.g. Jenkins, GitLab, etc.).
•Experience with Containerization (Docker) and Container Orchestration (Kubernetes).
•Experience with chaos engineering practices and tools such as Chaos Monkey, Gremlin, or similar frameworks.
•Good command of Linux/Unix and command line knowledge.
•Experience in application administration, configuration, and integration.
•Familiarity with agile development methodologies.
•Skill and discipline to work effectively with a distributed team.
•Ability to work in a highly collaborative, forward thinking, and innovation-driven environment.
•Knowledge of Agile and DevSecOps/SRE concepts and best practices, with a desire to grow that knowledge.
•Hands-on experience with Atlassian products (Jira, Confluence, Bitbucket, etc.).
•Experience creating Jira and/or Azure DevOps workflows, projects, and custom configurations.
•Experience administering/maintaining an SRE platform via Ansible playbooks (e.g. upgrading Jenkins).
•Experience in automating tasks with scripting languages like PowerShell or Python.
•Experience integrating with and maintaining various third-party CI/CD tools like Jenkins and GitLab.
•Experience with PaaS using Red Hat OpenShift/Kubernetes and Docker containers.
•Experience with commercial cloud infrastructure deployment environments such as AWS and Azure.
•Experience with automated provisioning and configuration tools like Terraform, CloudFormation, Chef, Puppet, Ansible, or similar technologies.
•Working knowledge of the Risk Management Framework (RMF) and DISA STIGs.
These Qualifications Would be Nice to Have:
•Experience with Infrastructure as Code (IaC) tools such as Terraform, Ansible, or CloudFormation for automating test environments.
•ITILv4, Scrum Master, or Agile SAFe certification(s) or applicable experience.
Original Posting Date:
2024-12-16
While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days with an anticipated close date of no earlier than 3 days after the original posting date as listed above.
Pay Range:
$89,700.00 - $162,150.00
The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.