Infrastructure Engineer - Compute
London, United Kingdom
Posted 5 hours 23 minutes ago by Nscale Ltd.
Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.
At Nscale, our Engineering team plays a critical role in driving the deployment and subsequent management of our infrastructure and software platforms.
We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you'll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you'll be contributing to building the technology that powers the future.
About the Role (Job Purpose)
Infrastructure Engineers at Nscale sit inside the Operational engineering team. The Operational engineering team is responsible for the design, implementation, operation, and continuous improvement of the infrastructure stack that underpins all internal and customer-facing services. This includes all components below the hypervisor, with a strong focus on OpenStack, storage systems, Proxmox, and critical supporting services such as DNS, DHCP, and infrastructure automation. The team ensures high levels of availability, scalability, automation, and security for the infrastructure layers it owns. It also acts as a 3rd/4th line escalation point for support organisations and provides subject matter expertise to pre-sales and other groups within the organisation.
What You'll be Doing (Responsibilities)
- Designing, implementing, and operating scalable and resilient infrastructure platforms, with a strong focus on OpenStack, Proxmox, Ceph, and supporting critical services such as DNS, DHCP, and configuration management.
- Continuously improving automation for provisioning, monitoring, patching, and recovery using infrastructure-as-code and configuration management tools.
- Collaborating with internal teams to ensure infrastructure solutions meet performance, availability, and security requirements.
- Acting as a 3rd/4th line escalation point for complex infrastructure issues, and working closely with support teams to resolve problems and identify root causes.
- Contributing to infrastructure roadmap planning, including capacity management, performance tuning, and introducing new technologies.
- Supporting pre-sales and solution design efforts by providing technical expertise on infrastructure capabilities and best practices.
- Ensuring all infrastructure platforms adhere to compliance, security, and operational standards.
- Participating in on-call rotations and incident response activities for critical infrastructure services.
- Expert-level experience with Linux systems administration.
- Strong experience designing and building automation for both physical and virtual infrastructure using tools such as Ansible.
- 5+ years of experience scripting in Python or Bash.
- Strong experience deploying, managing, upgrading, and operating large OpenStack clusters.
- Strong experience deploying, managing, and automating Proxmox.
- Extensive experience troubleshooting Linux and the services running on it.
- Understanding of datacenter operational best practices for power, cooling, and high-density compute.
- Ability to collaborate across global engineering and operations teams.
- Knowledge of Ironic
- Knowledge of Neutron/OVN/OVS
At Nscale, you'll find a collaborative, supportive, and innovative environment where your contributions spark real impact. We're building something extraordinary, and we want you at the core.
- Highly competitive package (base + equity) with reviews every 12 months.
- Join the fastest-growing tech startup: your chance to push boundaries, collaborate with brilliant minds, and make your mark on cutting-edge AI.
- Expect a dynamic progression plan tailored to your ambitions. Grow by trying new things, leading, challenging the status quo, and owning your impact, always with our full support.
- Human-First Flexibility: We treat you as humans first. Our flexible workplace trusts Nscalers to deliver, giving you the autonomy to shape your day around life's moments.
We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.
If there's anything we can do to accommodate your specific situation, please let us know.
The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.