Senior Site Reliability Engineer

Posted 14 hours 43 minutes ago by Stott and May

Permanent

Full Time

Other

South Glamorgan, United Kingdom

Job Description

Start: ASAP

Duration: 6+ months

Location: 3 days per week in central London

Rate: negotiable, depending on experience (inside IR35)

We're seeking a skilled Site Reliability Engineer to join a high-performing technology team within a leading professional services environment. You'll design and maintain reliable, secure, and scalable cloud infrastructure while driving automation and DevOps best practices.

Key Responsibilities

Build and manage CI/CD pipelines, release automation, and infrastructure as code (IaC).
Develop resilient cloud environments and optimize monitoring, alerting, and performance.
Maintain observability tools (Prometheus, Grafana, Datadog, Splunk, etc.).
Manage incident response, troubleshooting, and root cause analysis.
Collaborate with teams to improve reliability, scalability, and deployment efficiency.
Document and standardize technical processes and runbooks.

Skills & Experience

4+ years in SRE, DevOps, or related roles.
Advanced Kubernetes (EKS, GKE, AKS, or RKE); proficient with kubectl and Helm.
Strong containerization experience (Docker; microservices built with Java/Spring Boot).
CI/CD tools: Jenkins, GitHub Actions, Azure DevOps, Argo CD.
IaC: Terraform or Pulumi (module development preferred).
Observability & monitoring: Prometheus/Grafana, Datadog, Opsgenie, or similar.
Familiarity with Git workflows, Python/Go scripting, and security tools (Vault, Qualys, etc.).