
ML Engineer with Large Models Experience

Posted 17 hours 28 minutes ago by Talent Search PRO

Permanent
Full Time
Other
London, United Kingdom
Requirements
  • Significant hands-on experience optimizing deep learning models
  • Proven ability to profile and debug performance bottlenecks
  • Experience with distributed or large-scale training and inference
  • Familiarity with techniques such as mixed precision, quantization, distillation, pruning, caching, and batching
  • Experience with large models (e.g., transformers)
  • Practical CUDA development experience
  • Deep understanding of at least one major deep learning framework (ideally PyTorch)
  • Experience building and operating ML systems on cloud platforms (AWS, Azure, or GCP)
  • Comfort working with experiment tracking, monitoring, and evaluation pipelines
Job Description

Own and optimize the performance of the company's AI/ML foundation model: design GPU components, reduce latency, and work with the founders on optimization goals. The role requires CUDA, Python, and deep learning expertise.

- Own the performance, scalability, and reliability of the company's foundation model in both training and inference.
- Profile and optimize the end-to-end ML stack: data pipelines, training loops, inference serving, and deployment.
- Design and implement GPU-accelerated components, including custom CUDA kernels where off-the-shelf libraries are insufficient.
- Reduce latency and cost per inference token while maximizing throughput and hardware utilization.
- Work closely with the founders to translate product requirements into concrete optimization goals and technical roadmaps.
- Build internal tooling, benchmarks, and evaluation harnesses to help the team experiment, debug, and ship safely.
- Contribute to model architecture and system design where it impacts performance and robustness.
