ML Engineer - LLM Training & Optimization
Posted 1 day 9 hours ago by Lenovo
We are Lenovo. We do what we say. We own what we do. We WOW our customers.
Lenovo is a US$69 billion revenue global technology powerhouse, ranked in the Fortune Global 500, and serving millions of customers every day in 180 markets. Focused on a bold vision to deliver Smarter Technology for All, Lenovo has built on its success as the world's largest PC company with a full-stack portfolio of AI-enabled, AI-ready, and AI-optimised devices (PCs, workstations, smartphones, tablets), infrastructure (server, storage, edge, high-performance computing, and software-defined infrastructure), software, solutions, and services. Lenovo's continued investment in world-changing innovation is building a more equitable, trustworthy, and smarter future for everyone, everywhere.
Lenovo is listed on the Hong Kong stock exchange under Lenovo Group Limited (HKSE: 992) (ADR: LNVGY).
Description and Requirements
This role is open for the Edinburgh, Scotland location only. Candidates must be based there, as the position requires working from the office at least three days per week (3:2 hybrid policy).
The Lenovo AI Technology Center (LATC), Lenovo's global AI Center of Excellence, is driving our transformation into an AI-first organisation. We are assembling a world-class team of researchers, engineers, and innovators to position Lenovo and its customers at the forefront of the generational shift toward AI.
We are seeking a highly motivated and skilled AI Cloud Engineer to join our rapidly growing AI team. You will play a critical role in the training of large language models (LLMs), large vision models (LVMs), and large multimodal models (LMMs), including fine-tuning and reinforcement learning. This is a challenging yet rewarding opportunity to contribute to cutting-edge research and development in generative AI.
Responsibilities
- Design, implement, and evaluate training pipelines for large generative AI models, encompassing multiple stages of post-training.
- Design, implement, and evaluate data augmentation pipelines to increase the diversity and robustness of training datasets, improving model performance, particularly in low-data regimes.
- Develop and implement adversarial training techniques to improve model robustness against adversarial attacks and enhance generalisation performance by exposing the model to perturbed input examples during training.
- Develop and execute supervised fine-tuning (SFT) strategies for specific tasks.
- Run and refine reinforcement learning from human feedback (RLHF) pipelines to align models with human preferences.
- Design and implement model pruning strategies to reduce model size and computational complexity.
- Develop and perform model distillation techniques to compress large language models into smaller, more efficient models while preserving key performance characteristics.
- Implement and evaluate model quantisation techniques to reduce model size and accelerate inference speed.
- Utilise low-rank adaptation (LoRA) techniques for efficient fine-tuning of large language models.
- Experiment with various training techniques, hyperparameters, and model architectures to optimise performance and efficiency.
- Develop and maintain data pipelines for processing and preparing training data.
- Monitor and analyse model training progress, identify bottlenecks, and propose solutions.
- Stay up to date with the latest advancements in large language models and related technologies.
- Collaborate with other engineers and researchers to design, implement, and deploy AI powered products.
- Contribute to the development of internal tools and infrastructure for model training and evaluation.
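As context for the LoRA fine-tuning work listed above, here is a minimal sketch of low-rank adaptation in PyTorch: a frozen pretrained linear layer augmented with a trainable low-rank update. The class and hyperparameter names (`LoRALinear`, `r`, `alpha`) are illustrative assumptions, not part of Lenovo's internal tooling.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable low-rank update (B @ A) * (alpha / r)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        # A is small random, B is zero-initialised so the wrapped layer
        # starts out exactly equal to the base layer.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling

layer = LoRALinear(nn.Linear(64, 64))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable {trainable} of {total} parameters")  # → trainable 1024 of 5184 parameters
```

Only the two low-rank matrices receive gradients, which is why LoRA makes fine-tuning billion-parameter models tractable on modest hardware.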
Basic Qualifications
- Master's degree in Computer Science, Machine Learning, or a related field and 5+ years of relevant work experience, or a Bachelor's degree and 7+ years.
- Strong programming skills in Python and experience with deep learning frameworks such as PyTorch.
- Solid understanding of machine learning principles, including supervised, unsupervised, and reinforcement learning.
- Proven experience designing and conducting experiments, analysing data, and drawing meaningful conclusions.
- Familiarity with large language models, transformer architectures, and related concepts.
- Experience with data processing tools such as Pandas, NumPy.
- Experience working with Linux systems and/or HPC cluster job scheduling (e.g., Slurm, PBS).
- Excellent communication, collaboration, and problem solving skills.
Preferred Qualifications
- Ph.D. in Computer Science, Machine Learning, or a related field.
- Experience with distributed training frameworks (e.g., DeepSpeed, Megatron-LM).
What We Offer
- Opportunities for career advancement and personal development.
- Access to a diverse range of training programmes.
- Performance-based rewards that celebrate your achievements.
- Flexibility with a hybrid work model (3:2) that blends home and office life.
- Electric car salary sacrifice scheme.
- Life insurance.
We are an Equal Opportunity Employer and do not discriminate against any employee or applicant for employment because of race, colour, sex, age, religion, sexual orientation, gender identity, national origin, status as a veteran, disability, or any other federal, state, or local protected class.
Additional Locations: United Kingdom - Renfrewshire - Renfrew.
We use AI-based tools to support some of our processes (e.g., online interview recordings and transcripts) to improve efficiency and accuracy and for documentation purposes. AI can make mistakes, but we always ensure that the outputs are manually reviewed by a human. You can opt out at any time or contact us with any questions.
If you require an accommodation to complete this application, please contact .