About the role
The AI Research Division of Agile Robots is looking for an ML Platform Engineer (m/f/d) to build and operate the distributed training, deployment, and experimentation infrastructure that research, data, and robotics teams depend on to move models from prototype to production.
Your Responsibilities
- Training Infrastructure: Design and scale distributed training workflows for large models using tools such as PyTorch Distributed, DeepSpeed, and cluster schedulers like SLURM or Kubernetes.
- ML Platform: Build and maintain containerised ML environments that support reproducible experimentation and benchmarking.
- CI/CD Pipelines: Develop and maintain CI/CD pipelines for mac...
Ready to Apply?
Submit your application for ML Platform Engineer (m/f/d) at Agile Robots SE