Site Reliability Engineer Job in UK 2026 with Visa Sponsorship at Mistral AI
Posted 3 days 18 hours ago by NewsNowGh
Mistral AI is hiring a highly experienced Site Reliability Engineer (SRE) to strengthen the reliability, scalability, and performance of its cutting-edge AI platform and customer-facing systems. This role is based in London, England, with a strong European presence and flexible arrangements for eligible candidates.
The position is open to international professionals, with Skilled Worker visa sponsorship available, making it an exceptional opportunity for senior engineers seeking to build a long-term career in the UK's fast-growing artificial intelligence sector. You will join a world-class team working at the frontier of open, high-performance AI infrastructure.
About Role
As a Site Reliability Engineer, you will operate at the intersection of software engineering and production operations, balancing day-to-day reliability with long-term platform improvements. The role combines hands-on operations with infrastructure and platform engineering, supporting both customer-facing services and large-scale AI model training environments.
You will work closely with software engineers, security teams, and AI researchers to ensure systems are highly available, secure, reproducible, and scalable across multiple environments and high-performance computing clusters.
About Hiring Firm
Mistral AI is a pioneering AI company focused on democratising artificial intelligence through high-performance, optimised, and open models and platforms. Its products are designed to integrate seamlessly into enterprise and research environments, both on premises and in the cloud. With teams across Europe, the UK, the USA, and Asia, Mistral AI is known for its low-ego, innovation-driven culture.
The company is building the next generation of AI infrastructure and tools that are already shaping how organisations deploy and use advanced AI systems.
Responsibilities
- Design, build, and maintain scalable, highly available, and fault-tolerant infrastructure for web services and ML workloads
- Ensure high availability of inference and training environments and enable replication across HPC clusters
- Operate and troubleshoot production systems, including incident response and root cause analysis
- Implement and improve monitoring, alerting, logging, and incident management systems
- Build and maintain CI/CD, containerisation, orchestration, and automation workflows
- Drive infrastructure as code, deployment, and orchestration using tools such as Kubernetes and Terraform
- Collaborate with researchers to enable safe, reproducible model training and experimentation
- Develop new tooling, dashboards, and workflows to improve reliability, performance, and operability
- Work with security teams to ensure compliance with best practices and standards
- Document systems and processes, and contribute to knowledge sharing and open source initiatives
Requirements
- Master's degree in Computer Science, Engineering, or a related field
- 7+ years of experience in SRE, DevOps, or similar roles in distributed systems environments
- Strong experience with cloud platforms, highly available systems, and reliability engineering practices
- Hands-on experience with Docker, Kubernetes, CI/CD pipelines, and infrastructure as code tools
- Proficiency in scripting or programming (e.g., Python, Go, Bash)
- Solid knowledge of observability stacks, networking, security, and system administration
- Excellent problem-solving skills and ability to work in fast-paced, high-impact environments
This is a rare opportunity to join one of Europe's most exciting AI companies with visa sponsorship, strong benefits, and global impact. If you are a senior SRE looking to relocate to the UK or grow your international career while working on world-class AI infrastructure, Mistral AI offers a truly exceptional platform for your next career move.