HPC Engineer
Posted 18 hours 9 minutes ago by Arcus Search
Permanent
Not Specified
Other
England, United Kingdom
Job Description
In this role you'll design and elevate the infrastructure behind high-performance simulation and data-heavy workloads. If you thrive on solving compute challenges at scale and pushing performance boundaries, this is your opportunity to make a measurable impact.
You'll support complex engineering environments focused on CFD, FEA, and large-scale modeling, combining advanced Linux expertise, parallel computing techniques, and high-throughput systems to drive scientific and engineering progress.
Responsibilities:
- Design and deploy modern HPC environments, including compute clusters, storage, and interconnects
- Manage enterprise-grade Linux systems (primarily RHEL and Rocky), with a focus on stability, automation, and uptime
- Oversee workload orchestration using tools like Slurm, fine-tuned to maximize resource availability
- Analyze and fine-tune performance of key simulation software
- Improve parallel processing efficiency using MPI, OpenMP, or CUDA in CPU/GPU environments
- Use observability platforms such as Grafana to track and respond to system metrics
- Partner with simulation engineers to adapt the compute environment to evolving workload demands
- Create knowledge resources, training materials, and hands-on guidance to support end-users
Requirements:
- Strong grounding in high-performance compute systems and distributed infrastructure
- Deep experience managing and tuning Linux-based environments in technical computing contexts
- Solid hands-on experience with resource management and queuing systems (preferably Slurm)
- Exposure to simulation-driven workloads in engineering or scientific domains
- Skilled in scripting and automation using Bash and Python for cluster management and diagnostics
- Strong understanding of parallel execution models and application-level optimization
- Familiar with high-bandwidth networks (InfiniBand) and distributed storage systems like Lustre or GPFS
- Excellent problem-solving instincts, user-centric thinking, and communication skills