Staff Software Engineer, Infrastructure Engineering
Berkeley or Remote
Responsibilities
- Help design, implement, and maintain foundational services, frameworks, and libraries for data and compute management
- Work in cooperation with multiple teams to expand and mature our cloud presence
- Develop and enforce company-wide standards related to code quality, code organization, and codebase management
- Work with DevOps to maintain our CI/CD pipelines and ensure that developers can build, test, and deploy changes quickly without sacrificing quality
- Leverage telemetry toolkits to track core performance and reliability metrics of the services we own. Help fulfill rigorous SLAs and performance guarantees
- Help the company survey, evaluate, and adopt major new technologies. Design, coordinate, and execute company-wide adoption plans
- Lead complex multi-team projects. Collaborate across DevOps, SysOps, Engineering, and Research teams to deliver a cohesive platform across research and production
- Mentor and develop other engineers on the team, and share your practices and knowledge across the company
Requirements
- Computer Science degree or equivalent experience
- 7+ years of software engineering experience building high-performance systems
- Experience operating and scaling mission-critical, large-scale production systems built in languages such as Python, Go, and C++
- Excellent communication and project management skills in complex technical domains
- Track record of mentoring engineers and setting technical direction
Preferred Qualifications
- Experience with ML research platforms and associated frameworks for data processing, batch computing, and research (e.g., Apache Airflow, Kubeflow, Slurm, AWS/GCP Batch, Spark, Dask)
- Expertise in CI/CD, build systems, and best practices for large codebase management (e.g., Bazel, Jenkins, GitHub Actions)
- Exposure to cloud platforms, cloud-native architectures, and Infrastructure-as-Code tools for managing cloud infrastructure (e.g., Terraform, Pulumi, CloudFormation)
Job Profile
- 20 days of paid time off
- 401(k) plan with company match
- Candidate referral program
- Collaborative environment
- Dental coverage
- Life and AD&D insurance
- Medical coverage
- Sick days
- Vision coverage
Tasks
- Design and maintain services
- Enforce code quality standards
- Expand cloud presence
- Lead multi-team projects
- Maintain CI/CD pipelines
- Mentor engineers
- Track performance metrics
Apache Airflow, AWS, Bazel, C, C++, CI, CloudFormation, Communication, Dask, Finance, GCP, GitHub Actions, Go, Investment Management, Jenkins, Kubeflow, Machine Learning, Mentoring, Project Management, Pulumi, Python, Research, Slurm, Software Engineering, Spark, Terraform
Experience: 7 years
Education: Computer Science degree, equivalent experience, Finance
Timezones: America/Anchorage, America/Chicago, America/Denver, America/Los_Angeles, America/New_York, Pacific/Honolulu, UTC-10, UTC-5, UTC-6, UTC-7, UTC-8, UTC-9