Software Engineer for AI Inference Optimization

NVIDIA Gruppe

Toronto, ON, Canada Full-time June 13, 2026

Job Description

Join NVIDIA as a Senior Software Engineer specializing in AI inference optimization, where your skills in GPU kernel development and benchmarking will be central to the work.
This role calls for experienced software engineers dedicated to improving AI inference systems. You will help architect and optimize the vLLM inference framework, with a focus on high-performance computing across GPU clusters, and collaborate with various teams to push the boundaries of accelerated computing.
Key Responsibilities:
• Enhance vLLM's features to optimize new models
• Benchmark and optimize GPU kernels using advanced methods
• Create methodologies for industry-leading benchmarking tools
• Design orchestration for large-scale inference deployments
• Conduct original research for ML Systems advancements
Requirements:
• PhD with top publications in ML Systems or a relevant field
• Expertise in programming with Python and C/C++
...