GPU Benchmarking Engineer (DRW portfolio company)
Location: Remote
A DRW portfolio company is seeking a highly skilled and motivated Lead GPU Benchmarking Engineer to join its team. The ideal candidate will have extensive hands-on experience with GPU hardware, benchmarking tools, performance analysis, programming, and automation. The role involves designing and executing rigorous test protocols to assess GPU reliability, and leading the development and implementation of comprehensive GPU benchmarking frameworks. The candidate should also have the potential to operate at a larger scope and grow into leadership roles such as Chief Technology Officer (CTO).
Key Responsibilities:
- Test Design and Execution:
- Develop and implement comprehensive test plans to evaluate GPUs under prolonged heavy workloads using stress testing software.
- Monitor key metrics such as frame rates, temperature, peak and average power consumption, peak and sustained FLOPS, cross-node bandwidth, and stability over time.
- Benchmark GPUs using industry-standard benchmarking tools to measure and analyze performance.
- Provide leadership and mentorship to a team of engineers, fostering a culture of innovation and technical excellence.
- Data Collection and Analysis:
- Conduct baseline tests on new GPUs to establish initial performance benchmarks.
- Track performance metrics over time to detect and analyze any degradation.
- Utilize GPU driver APIs to collect low-level telemetry during various operational conditions.
- Performance Comparison and Validation:
- Compare performance metrics across different cluster configurations to identify comparative strengths and weaknesses.
- Perform statistical analyses to ensure the validity and reliability of the test results.
- Repeat tests to ensure consistency and accuracy of data.
- Reporting and Documentation:
- Prepare detailed reports outlining test setups, methodologies, and data-driven conclusions.
- Clearly communicate findings, insights, and recommendations to team members and stakeholders.
- Cloud Computing Integration:
- Configure, deploy, and maintain cloud infrastructure for automation, orchestration, and integration.
- Utilize cloud computing resources to create scalable and efficient testing environments.
- Optimize cloud platform usage for benchmarking and data analysis tasks.
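As an illustration of the repeat-test and degradation-tracking duties listed above, here is a minimal Python sketch. It assumes throughput samples have already been collected from repeated runs; the function names, the 3% degradation threshold, the 2% noise limit, and the sample values are all hypothetical and chosen only for the example.

```python
from statistics import mean, stdev


def summarize_runs(samples):
    """Return mean, sample standard deviation, and coefficient of variation."""
    m = mean(samples)
    s = stdev(samples) if len(samples) > 1 else 0.0
    return {"mean": m, "stdev": s, "cv": s / m if m else float("inf")}


def degradation_pct(baseline, current):
    """Percent drop of the current mean relative to the baseline mean."""
    base_mean = mean(baseline)
    return 100.0 * (base_mean - mean(current)) / base_mean


def is_degraded(baseline, current, threshold_pct=3.0, max_cv=0.02):
    """Flag degradation, but only when both run sets are consistent enough to trust.

    threshold_pct and max_cv are illustrative defaults, not recommended values.
    """
    for runs in (baseline, current):
        if summarize_runs(runs)["cv"] > max_cv:
            raise ValueError("runs too noisy; repeat the test before comparing")
    return degradation_pct(baseline, current) > threshold_pct


# Hypothetical sustained-throughput samples in TFLOPS (made-up numbers).
baseline_tflops = [100.0, 101.0, 99.0, 100.0]
current_tflops = [95.0, 96.0, 94.0, 95.0]
print(is_degraded(baseline_tflops, current_tflops))  # prints True (5% drop)
```

Checking run-to-run variation before comparing means is one simple way to separate real degradation from measurement noise; a formal hypothesis test could replace the fixed threshold when more samples are available.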
Required Qualifications:
- Bachelor's degree in Computer Science, Electrical Engineering, or a related field.
- Proven experience in compute benchmarking, stress testing, and performance analysis.
- Proficiency with benchmarking tools such as 3DMark, CUDA and OpenCL benchmarks, FurMark, MSI Kombustor, SPECviewperf, Unigine Heaven, and Superposition Benchmark.
- Strong understanding of GPU cluster architectures and relevant performance metrics.
- Experience with using …
Job Profile
Benefits/Perks:
- Career growth opportunities
- Collaborative environment
- Competitive salary
- Cutting-edge technology
- Remote-first company
Tasks:
- Analyze performance metrics
- Develop test plans
- Lead and mentor engineers
- Prepare reports
Skills:
- Automation
- Bash
- C
- C++
- Cloud Computing
- Communication
- CUDA
- Leadership
- Performance analysis
- PowerShell
- Python
Education:
- Bachelor's degree in Computer Science, Electrical Engineering, or a related field