HPC Engineer

Posted 18 hours 9 minutes ago by Arcus Search

Permanent
Not Specified
Other
England, United Kingdom
Job Description

In this role you'll design and elevate the infrastructure behind high-performance simulation and data-heavy workloads. If you thrive on solving compute challenges at scale and pushing performance boundaries, this is your opportunity to make a measurable impact.

You'll support complex engineering environments focused on CFD, FEA, and large-scale modeling, combining advanced Linux expertise, parallel computing techniques, and high-throughput systems to drive scientific and engineering progress.

Responsibilities:
  • Design and deploy modern HPC environments, including compute clusters, storage, and interconnects
  • Manage enterprise-grade Linux systems (primarily RHEL and Rocky), with a focus on stability, automation, and uptime
  • Oversee workload orchestration using tools like Slurm, fine-tuned to maximize resource availability
  • Analyze and fine-tune performance of key simulation software
  • Improve parallel processing efficiency using MPI, OpenMP, or CUDA in CPU/GPU environments
  • Use observability platforms such as Grafana to track and respond to system metrics
  • Partner with simulation engineers to adapt the compute environment to evolving workload demands
  • Create knowledge resources, training materials, and hands-on guidance to support end-users
Experience/Skills:
  • Strong grounding in high-performance compute systems and distributed infrastructure
  • Deep experience managing and tuning Linux-based environments in technical computing contexts
  • Solid hands-on experience with resource management and queuing systems (preferably Slurm)
  • Exposure to simulation-driven workloads in engineering or scientific domains
  • Skilled in scripting and automation using Bash and Python for cluster management and diagnostics
  • Strong understanding of parallel execution models and application-level optimization
  • Familiar with high-bandwidth networks (InfiniBand) and distributed storage systems like Lustre or GPFS
  • Excellent problem-solving instincts, user-centric thinking, and communication skills