Senior Site Reliability Engineer

Posted 2 hours ago by Stratospherec Ltd

£80,000 - £100,000 Annual
Permanent
Full Time
University and College Jobs
Not Specified, United Kingdom
Job Description
Senior Site Reliability Engineer / DevOps Engineer - Full Remote (UK)

Fully remote working for UK citizens based in the UK - Salary £80k to £100k + Benefits

We are looking for a Senior Site Reliability Engineer / DevOps Engineer with a background in .NET Software Development and strong C# programming skills, combined with a strong knowledge of DevOps tools such as Kubernetes and Docker and of the Azure cloud platform. Our client is a global digital SaaS software company with a fantastic fully remote opportunity for an experienced Senior Engineer to join their UK Cloud Infrastructure team.

Senior Site Reliability Engineers at this company are responsible for keeping the SaaS products running properly. Using concepts of software and systems engineering, they work to improve the reliability of all cloud systems while keeping levels of manual work low. SREs are expected to be experienced in software engineering principles, operational discipline, and automation. The SRE team works on a fully remote basis in conjunction with US and Australian teams.

The company is a market leader in student community management software. Its unique SaaS platform plays an essential part in the lives of millions of university students across the globe. In this role you will apply your Software Engineering experience to enhance system performance and reliability, as well as build internal systems and capabilities that eliminate manual work through automation. You will be joining their Platform teams, working with globally dispersed Site Reliability and Platform Engineers in a "follow the sun" model to operate their products on a multi-region cloud platform.

Role Responsibilities
  • Provide technical leadership and mentoring within the team through knowledge sharing sessions, pair programming, code reviews and solution design.
  • Identify and implement technical solutions to improve platform reliability, including the creation of mitigation strategies and operational playbooks.
  • Implement and maintain monitoring/alerting/logging systems to identify and respond to incidents.
  • Ensure scalability and efficiency of cloud infrastructure and systems to handle traffic and data growth.
  • Conduct performance testing to identify and remediate bottlenecks.
  • Develop and maintain platform solutions, automate infrastructure provisioning, configuration and management tasks using Infrastructure as Code.
  • Monitor, review and tune databases to ensure high availability and performance.
  • Collaborate with product engineering teams to design and build fit-for-purpose observability software.
Required Skills and Experience
  • Proven experience in an SRE/DevOps/Platform Engineering role, plus previous experience in a .NET and C# Software Engineering role.
  • Proficiency in the C# programming language, alongside knowledge of scripting languages such as Bash, Python or PowerShell.
  • Production experience operating containerisation technologies - ideally with Kubernetes and/or Docker.
  • Proficiency with one or more public cloud providers such as Azure, AWS or GCP.
  • Proficiency using Infrastructure as Code (IaC) tools such as Terraform (preferred), Ansible or CloudFormation.
  • Experience with monitoring, observability and logging tools such as DataDog, Prometheus, Grafana or similar.
  • Proven track record of maintaining highly available and performant production environments.
  • Ability to identify and implement effective mitigation strategies and operational playbooks.
Useful / Bonus Skills
  • Experience in CI/CD tooling - Azure DevOps/GitHub Actions, Octopus Deploy.
  • Relevant certifications in cloud platforms (e.g., Microsoft Certified: Azure Solutions Architect) and DevOps practices (e.g., Certified Kubernetes Administrator) are a plus.
  • Experience in database management/performance tuning, particularly MSSQL.
Employee Benefits
  • Opportunity to be part of a well-established, high-performance SaaS company with a 30+ year history.
  • Excellent company pension scheme and life insurance.
  • Excellent holiday allowance.
  • Supportive team environment with an emphasis on learning and development opportunities.
  • Work with a team of caring, high-performing and passionate people who enjoy supporting our vision, innovation and continuous improvement.

This SRE/DevOps Engineer role is part of a large programme of change and improvement in our Cloud SaaS products over the coming years. If you are looking for an interesting SRE role with a forward-thinking global organisation, this would be a tremendous career opportunity to consider.

Please apply with your CV to find out more.