Solutions Architect, HPC Systems Engineer
US, CA, Santa Clara, United States
NVIDIA is looking for an experienced GPU and network systems Solutions Architect & Engineer. Do you want to be part of a team that brings new Artificial Intelligence (AI) hardware and software technologies to production in customer data centers? As part of the NVIDIA SA organization, you will drive the deployment and integration of our end-to-end technology solutions at some of NVIDIA's most strategic technology customers, and offer recommendations to business and engineering teams on our product roadmap.
What you will be doing:
Work with NVIDIA AI Native, Consumer Internet and IT Services customers on large data center GPU server and networking system deployments as a Solutions Architect Engineer. Guide customer discussions on network design and compute/storage, and support bring-up of server, network, and cluster deployments. You will need to visit customer data centers during the bring-up phase.
Demonstrate subject matter expertise in advanced GPU & network systems and be a trusted technical advisor to NVIDIA's strategic customers. Bring customer-specific requirements to product teams to guide product roadmap features.
Identify new project opportunities for NVIDIA products and technology solutions in data center and artificial intelligence applications. Work closely with the GPU/Network Systems Engineering, Product Management and Sales teams.
Work as a trusted customer advisor, conducting regular technical customer meetings on the product roadmap, debugging of cluster issues, feature discussions, and introductions to new technology solutions
Build custom product demonstrations and POCs for solutions that address critical business needs of our customers
Analyze and debug compute/network configuration and performance issues to deliver performant clusters
What we need to see:
BS/MS/PhD in Electrical/Computer Engineering, Computer Science, Physics, or other Engineering fields or equivalent experience.
This role is for an individual with the motivation and skills to drive the data center engineering process. The ideal candidate has 5+ years of Systems/Solution Engineering (or similar Engineering roles) experience
System-level expertise in CPU/GPU server architecture, NICs, Linux, system software and kernel drivers
Experience with networking switches for Ethernet/InfiniBand, and data center infrastructure (power/cooling)
Knowledge of DevOps/MLOps technologies such as Docker/containers, Kubernetes
Effective time management and capable of balancing multiple tasks
Strong verbal and written communication skills, with the ability to share your ideas and code clearly through documents, presentations, etc.
Ways to stand out from the crowd:
External customer-facing background
Experience with bring-up and deployment of large clusters
Systems engineering, coding, and debugging skills including experience with C/C++, Linux kernel and drivers
Hands-on experience with NVIDIA GPU systems/SDKs (e.g. CUDA), NVIDIA Networking technologies (e.g. NICs, RoCE, InfiniBand), and/or ARM CPU solutions
Familiarity with virtualization technology concepts
We make extensive use of conferencing tools, but occasional (20%) travel is required for on-site visits to customers and industry events. We are open to remote work locations and look forward to having you join our team!
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!
The base salary range is 148,000 USD - 235,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.