Staff Software Engineer - Inference

Remote (US & CAN)

Lambda's GPU cloud is used by deep learning engineers at Stanford, Berkeley, and Carnegie Mellon. Lambda's on-prem systems power research and engineering at Intel, Microsoft, Kaiser Permanente, major universities, and the Department of Defense.

If you'd like to build the world's best deep learning cloud, join us.

What You’ll Do

  • Help design, build and improve our new inference and ML computation platform, acting as technical lead for one or more of our implementation teams
  • Work with management, product and other internal business partners to drive technical decisions based on business and market needs
  • Work on the architecture of our distributed systems to ensure best-in-class reliability and efficiency, while helping to minimize operational costs and toil
  • Provide your team empathetic leadership as well as mentorship to grow their own skills and abilities
  • Build products around a large range of ML models and types, including industry-leading research
  • Help build safety and fraud systems, around both inference and other ML systems
  • Handle interesting scaling, hardware and scheduling challenges in a rapidly changing industry sector

You

  • Are an experienced lead software engineer with ten or more years of working on business-critical distributed systems.
  • Have a history of leading projects from inception to production, including making technical decisions, authoring design and decision documents, and advising on staffing needs.
  • Have significant experience architecting systems around relational databases, document databases, queue datastores, block storage, object storage, unreliable networks, and caches.
  • Have a deep understanding of the balance between initial build costs and operational costs, and what it takes to launch a product quickly but with a good technical foundation.
  • Are highly proficient in both Go and Python, and can pick up other languages as needed.
  • Are very familiar with building integrated test frameworks and using CI/CD systems.
  • Are product-oriented and focused on great user experiences, and are invested in building the best product possible for users.
  • Are good at working cross-functionally and solving problems across teams, including empathetic conflict resolution when working alongside teams with different priorities.
  • Have recent team leadership experience (on a team of four or more people).

Nice to Have

  • Experience writing Kubernetes operators or other Kubernetes integrations
  • Experience running ML/GPU workloads in production
  • Experience with computation dispatch and …