Machine Learning Engineer
Posted 3 hours 1 minute ago by Harrington Starr
Machine Learning Engineer - Large-Scale Systems & Real-Time Inference
This role offers a Machine Learning Engineer the opportunity to design and deliver large-scale systems supporting the training, optimisation and deployment of advanced models in high-performance environments. You will collaborate with researchers, software engineers and hardware specialists to build solutions that push the limits of GPU acceleration, distributed computing and automated experimentation.
Key Responsibilities
Design and implement distributed training pipelines handling high-volume data and complex model architectures
Develop low-latency inference systems providing real-time, high-accuracy predictions in production
Optimise and extend machine learning frameworks to improve training and inference performance
Leverage GPU programming (CUDA, cuDNN, TensorRT) to maximise efficiency
Automate model experimentation, tuning and retraining in partnership with research teams
Work with infrastructure specialists to optimise workflows and reduce compute costs
Assess and integrate emerging open-source tools to enhance ML development and deployment
Skills and Experience
5+ years' experience in machine learning with a focus on training and inference systems
Strong programming expertise in Python, plus C++ or CUDA
Proficiency with PyTorch, TensorFlow or JAX
Hands-on experience with GPU acceleration and distributed training (Horovod, NCCL or similar)
Background in real-time, low-latency ML pipelines
Familiarity with cloud and orchestration technologies
Contributions to open-source ML or distributed systems projects are advantageous
What's on Offer
Work on technically demanding, high-impact ML systems at global scale
Collaborate with a multidisciplinary team of experts in research, engineering and HPC
Join a culture that prizes innovation, precision and technical excellence
Competitive remuneration, benefits and clear scope for progression