Found Description
What You Will Be Doing
- Taking part in the development of NVIDIA's AI platform for training, fine-tuning, and serving the latest and greatest AI models with the best performance and efficiency.
- Designing and building solutions for scheduling large-scale AI training and inference workloads on GPU clusters across many cloud infrastructures.
- Exploring and finding solutions to open problems such as industry-scale resource management, GPU scheduling, performance prediction, and live workload migration.
- Working with and contributing to adjacent teams such as TensorRT/Dynamo inference engine, ML compiler, KAI/Grove scheduler, and Lepton cloud.
What We Need To See
- Bachelor's degree or equivalent experience in Computer Science, Computer Engineering, or a relevant technical field.
- 5+ years of experience.
- Experience building large-scale systems from scratch. Prior experience in container-based deployment systems lik...
Ready to Apply?
Submit your application for DL System Software Engineer - AI Platform at NVIDIA AI