
Senior C++/Deep Learning Engineer, GPU Optimization

Remote - US

About the Company

At Torc, we have always believed that autonomous vehicle technology will transform how we travel, move freight, and do business.

A leader in autonomous driving since 2007, Torc has spent over a decade commercializing our solutions with experienced partners. Now a part of the Daimler family, we are focused solely on developing software for automated trucks to transform how the world moves freight.

Join us and catapult your career with the company that helped pioneer autonomous technology, and the first AV software company with the vision to partner directly with a truck manufacturer.

Meet the team 

Torc's virtual driver software utilizes cutting-edge deep learning techniques to perceive the vehicle's environment, predict the movements of other vehicles, and execute accurate driving decisions. We are actively seeking a highly experienced senior engineer to join the hardware acceleration team. This is an exceptional opportunity for you to have a significant impact on the future of the autonomous vehicle industry by enhancing AI performance. 

What you'll do: 

  • Optimize machine learning inference models for NVIDIA Orin execution 
  • Leverage data parallelism and CUDA programming 
  • Implement TensorRT plugins 
  • Stay abreast of the latest advancements in PyTorch, maximizing their potential for target hardware execution 
  • Collaborate with machine learning engineers to develop innovative and performant deep learning solutions 
  • Analyze and optimize deep learning inference using profiling and optimization tools, identifying and eliminating performance bottlenecks 
  • Contribute to the development of internal tools and libraries to further enhance deep learning performance on the target hardware 
  • Document your work clearly and concisely, sharing knowledge effectively with team members 

What you'll need to succeed: 

  • Bachelor's degree in computer science, data science, artificial intelligence, or a related field with 6+ years of professional experience, or a master's degree with 3+ years of experience 
  • Mastery of modern C++ (C++14 or later) and Python, with the ability to write efficient and maintainable code for both performance and flexibility 
  • Familiarity with object-oriented software design patterns, and their implementation in C++ 
  • In-depth knowledge of CUDA programming and experience with optimizing deep learning kernels 
  • Excellent understanding of parallel computing (GPGPU) and high-performance computing (HPC) concepts 
  • …