Sr. Staff Software Engineer - IaaS (Cluster Management) (REMOTE)
MD Chevy Chase (Office) - JPS
Position Description
As a Sr. Staff Engineer, you will work with our Distinguished Engineers, Staff Engineers, and Sr. Engineers to innovate and build new systems, improve and enhance existing systems, and identify new opportunities to apply your knowledge to solve critical problems. You will lead the strategy and execution of a technical roadmap that increases the velocity of product delivery and unlocks new engineering capabilities. The Cluster Management team is driving the development of the next-generation Kubernetes-based container cluster platform, prioritizing security, reliability, scalability, and efficiency. We seek a candidate with deep technical expertise in designing, building, and maintaining secure cluster management systems on OpenStack IaaS at scale, across physical and public cloud environments.
Position Responsibilities
As a Sr. Staff Engineer, you will:
Provide technical and thought leadership across diverse areas.
Collaborate with teams, customers, and product managers to address challenges.
Develop and execute a strategic software development plan for IaaS, encompassing containers, cluster management, Kubernetes, and OpenStack. Prioritize security and optimize for performance and efficiency across the entire development lifecycle.
Own solution quality, usability, and performance.
Mentor and exemplify technical excellence, influencing the engineering and product community.
Share best practices, refine processes, and drive continuous improvement.
Analyze and forecast costs, and integrate them into business plans.
Determine resource needs, assess processes, and ensure adaptability for continuous learning.
Fulfill on-call responsibilities and offer operational support.
Qualifications
Proficient in multi-cluster networking using service mesh technologies like Istio, Consul, or Envoy.
Expertise in multi-cluster metrics, observability, and operations utilizing frameworks such as Grafana and Prometheus.
In-depth understanding of containerization technologies, including Docker, Podman, and Rancher.
Proficient with advanced technologies like ArgoCD, KubeVirt, and Cluster API (CAPI).
In-depth knowledge and practical experience in Linux operating systems, internals, and command-line utilities.
Proven expertise in optimizing CI/CD for streamlined Kubernetes deployment and configuration using GitOps and ArgoCD.
Hands-on experience in public and/or private cloud environments, including OpenStack, Kubernetes, Azure, AWS, and GCP.
Extensive experience in API, microservices, network, and security architectures, incorporating design patterns.
Strong foundations in software engineering, encompassing the entire software delivery lifecycle.
Professional experience in software development using modern programming languages like Go, Python, or Java.
Experience in security protocols and products, including Active Directory, SAML, and OAuth.
Demonstrated ability to design and implement resilient, scalable, and efficient solutions.
Experience in architecture and design, covering patterns, reliability, and scaling for both new and existing systems.
Fluent in DevOps concepts and cloud …
Job Profile
Benefits/Perks
- Continuous learning
- Dental
- Flexible hours
- Health and well-being
- Medical
- Paid training
- Paid Training and Licensures
- Paid Vacation
- Parental leave
- Remote work
- Total Rewards Program
- Tuition reimbursement
- Vision Insurance
Tasks
- Collaborate with teams
- Drive continuous improvement
- Enhance existing systems
- Innovate and build new systems
- Mentor engineers
- Provide operational support
- Share best practices
Skills
Active Directory, Algorithms, API, Architecture, ArgoCD, AWS, Azure, Building, CI/CD, Cloud, Cloud Architecture, Cloud Services, Communication, Consul, Containerization, Containers, Data Structures, Data structures and algorithms, Deployment, Design Patterns, DevOps, DevOps Concepts, Docker, Envoy, GCP, GitOps, Go, Grafana, IaaS, Istio, Java, Kubernetes, Leadership, Linux, Microservices, Networking, OAuth, Observability, OpenStack, Operations, PaaS, Podman, Problem-solving, Prometheus, Public Cloud, Python, Rancher, Reliability, SAML, Scripting, Security, Security Architecture, Security protocols, Shell scripting, Software Development, Software Engineering, Technical Roadmap
Experience
5 years
Education
Computer Science, Information Systems, or equivalent education or work experience