Distributed Systems Engineer (L5) - Compute Runtime

USA - Remote, United States

Netflix is one of the world's leading entertainment services, with 283 million paid memberships in over 190 countries enjoying TV series, films and games across a wide variety of genres and languages. Members can play, pause and resume watching as much as they want, anytime, anywhere, and can change their plans at any time.

Netflix has been on the leading edge of cloud adoption since migrating to AWS 15 years ago and runs one of the largest Cloud footprints. The Cloud Engineering organization exists to manage that massive scale, constantly innovating to increase fleet-wide agility, efficiency, and reliability of the Netflix cloud infrastructure, while solving scale problems that we are the first to ever hit. We build, operate, and maintain Compute, Network, and Storage services so that developers at Netflix can rely on foundational building blocks when entertaining hundreds of millions of customers globally.

About the Team

The Compute Runtime team is responsible for the data plane runtime environment for our Kubernetes-based orchestrator, which handles millions of container launches per day. We also provide the base OS and system services to hundreds of thousands of EC2 instances. We thrive on solving complex problems and love sharing our learnings with our fellow engineers. Here is a short sample: “Debugging a FUSE deadlock in the Linux kernel”, “Investigation of a Cross-regional Network Performance Issue”, and “Talking IPv4 to IPv6 without NAT”.

About the Role

We are seeking a highly skilled and accomplished engineer with demonstrable experience in evolving large-scale infrastructure systems and container runtimes on Linux. The ideal candidate will combine experience leading innovative solutions across functional teams with hands-on development experience in AWS/cloud, Linux user-space, networking, GPUs, and Kubernetes.

Key Responsibilities

  • Technical Delivery: Use your expertise to significantly advance the state of Netflix’s compute offerings for our single and multi-tenant partners.  

  • Strategic Planning: Evolve our infrastructure to meet Netflix’s business objectives around Streaming, Live events, and Gaming.  

  • Project Management: Lead your own and cross-functional teams to deliver highly ambiguous, open-ended projects, enforcing each stage of the Software Development Lifecycle framework.

  • Operational Excellence: Contribute to the ever-improving operational standards of our large-scale global services by applying engineering best practices and providing first-class on-call support.

  • Performance: Identify and resolve performance bottlenecks in the Linux networking stack and resource isolation components to optimize network traffic and minimize noisy neighbor issues for containers.

  • System Integration: Integrate Linux OS changes with user-space applications and container runtime, ensuring seamless operation within the Netflix ecosystem.

  • Presentation: Deliver write-ups, blog posts, and presentations at conferences such as Linux Plumbers and eBPF Summit to represent our Netflix engineering teams.

You will excel in this role with…

  • 4+ years of experience evolving Compute infrastructure for an organization and 8+ years of software engineering experience.

  • Technical expertise in:

    • Distributed systems at scale, preferably on AWS

    • Linux application development and related package managers

    • Go, Java, or C/C++

    • Containers & runtimes-as-a-service

    • Linux performance debugging

    • Basic Networking concepts 

  • Demonstrable experience delivering multiple strategic and ambiguous projects at scale.

  • Experience leading and influencing teams of 10+ peer engineers.

  • Excellent presentation, communication, and collaboration skills.

We are even more excited about…

  • Container Performance and Container Stack Contributions 

  • Familiarity with ML/AI concepts

  • Knowledge of GPU architecture, CUDA, and workload optimizations

  • AMI Management

*****************

Our compensation structure consists solely of an annual salary; we do not have bonuses. You choose each year how much of your compensation you want in salary versus stock options. To determine your personal top of market compensation, we rely on market indicators and consider your specific job family, background, skills, and experience to determine your compensation in the market range. The range for this role is $100,000 - $720,000.

Netflix provides comprehensive benefits including Health Plans, Mental Health support, a 401(k) Retirement Plan with employer match, Stock Option Program, Disability Programs, Health Savings and Flexible Spending Accounts, Family-forming benefits, and Life and Serious Injury Benefits. We also offer paid leave of absence programs.  Full-time hourly employees accrue 35 days annually for paid time off to be used for vacation, holidays, and sick paid time off. Full-time salaried employees are immediately entitled to flexible time off. See more detail about our Benefits here.

Inclusion is a Netflix value and we strive to host a meaningful interview experience for all candidates. If you want an accommodation/adjustment for a disability or any other reason during the hiring process, please send a request to your recruiting partner.

We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.

Job is open for no less than 7 days and will be removed when the position is filled.
