Senior LLM Inference & GPU Optimization Engineer

Confidential

Singapore, Singapore, Singapore Full-time June 24, 2026

Job Description

Confidential is looking for an expert in optimizing large language model (LLM) performance. In this role, you will optimize LLM inference for cost, latency, and throughput, and you will profile and tune GPU performance at a deep level. You will also collaborate with the model and platform teams to improve architecture performance.

The role requires deep experience in deep-learning inference optimization, hands-on GPU programming, and fluency in modern LLM serving stacks. It supports high-performance production environments.

