Job Description
Join NVIDIA as a Senior Software Engineer specializing in AI inference optimization. Your skills in GPU kernel development and benchmarking will be central to this role.
This role demands seasoned software engineers dedicated to refining AI inference systems. You will actively participate in architecting and optimizing the vLLM inference framework, focusing on high-performance computing across GPU clusters. Your collaboration with various teams will help push the boundaries of accelerated computing.
Key Responsibilities:
• Enhance vLLM's features to optimize new models
• Benchmark and optimize GPU kernels using advanced methods
• Create methodologies for industry-leading benchmarking tools
• Design orchestration for large-scale inference deployments
• Conduct original research for ML Systems advancements
Requirements:
• PhD with top publications in ML Systems or a relevant field
• Expertise in programming with Python and C/C++
• Know...