Staff Software Engineer, Infrastructure Architecture - AI/ML

San Francisco

Do you ever wonder what happens inside the cloud?

DigitalOcean (NYSE: DOCN) simplifies cloud computing so builders can spend more time creating software that changes the world. With our mission-critical infrastructure and fully managed offerings, DigitalOcean enables startups and small and medium-sized businesses (SMBs) to rapidly deploy and scale modern applications. As a remote-first organization, our employees, like our customers, are based around the world.

We want people who are passionate about designing and operating secure systems at scale.

We are looking for someone passionate about delivering a world-class GPU experience for our developer cloud. If you’re an open source advocate familiar with our stack, who enjoys working remotely and is excited about our mission, this role is for you!

At DigitalOcean, we believe in creating simple yet powerful foundations (with 💕) from which our community can build. The Infrastructure Fleet Organization delivers on this mission by building performant, reliable, modern, efficient, and secure platform foundations for all DigitalOcean products.

Our Stack: C/C++, Python, Go, Linux, libvirt, KVM, QEMU, Ceph

Our Tools: AWX, Chef, Elasticsearch, Git, GitHub Actions, GSuite, Jira, Nomad, Slack, VictoriaMetrics

Our Team: The person filling this position will report to the Sr. Engineering Director of the Infrastructure Fleet Organization (Infra::Fleet). Infra::Fleet currently comprises 7 teams and 60 diverse engineers located across the US, Canada, and Europe.

What You’ll Be Doing:

  • Work with your fellow sharks to design, develop, and optimize the next generation of virtualized GPU infrastructure
  • Work with customers and stakeholders to define and refine the infrastructure requirements needed to support their AI/ML workloads
  • Work with infrastructure technical leaders to define infrastructure requirements to store, move, and manipulate large datasets
  • Guide performance teams on industry-standard testing methodologies and help optimize for GPU fabric throughput
  • Identify security improvements and drive review discussions with internal teams
  • Influence a culture of engineering excellence through active engagement with DigitalOcean’s Architecture group
  • Work directly with individual engineering teams to deliver new infrastructure functions and technologies in support of DigitalOcean AI/ML products
  • Drive technical strategy that influences medium- and long-term roadmaps
  • Spend 5-20% of your time contributing to open source communities related to our stack and encouraging your fellow sharks to do the same

What We’ll Expect From You:

  • Experience …