Lead Site Reliability Engineer

Posted 15 days 8 hours ago by Lloyds Banking Group

Permanent

Full Time

Other

Edinburgh, United Kingdom

Job Description

JOB TITLE: Lead Site Reliability Engineer

LOCATION(S): Edinburgh, Halifax or London

HOURS: Full time

WORKING PATTERN: Hybrid, 40% of the time (or two days a week) in one of the above offices

About this Opportunity

The Lead SRE is accountable for the reliability, scalability, and performance of cloud infrastructure and platform services supporting Risk Foundations. This role ensures that services meet defined Service Level Objectives (SLOs), manages error budgets, and leads incident and problem management across multiple feature teams. The Lead SRE promotes SRE methodologies and collaborates with product and application teams to integrate reliability engineering into delivery pipelines.

Key Responsibilities

Reliability & Performance Management: Design, implement and own the SLOs for critical platform services. Monitor system health, manage error budgets, and drive improvements in Mean Time to Failure (MTTF) and Mean Time to Recovery (MTTR).
Incident & Problem Management: Lead incident response and post-mortem analysis. Ensure root cause identification and long-term remediation strategies are implemented.
Platform Advocacy & Collaboration: Champion SRE principles across Risk Foundations Labs. Collaborate with Lab Product Owners, Engineering Leads, and application teams to embed reliability into design and delivery.
Technical Leadership: Provide technical oversight across cloud infrastructure, CI/CD pipelines, observability tooling, and automation frameworks. Guide engineers in adopting scalable and resilient solutions.
Continuous Improvement: Identify and implement improvements in deployment, monitoring, and alerting processes. Drive automation to reduce toil and improve operational efficiency.
Governance & Compliance: Ensure platform services adhere to internal risk, security, and compliance standards. Support audit and regulatory reporting requirements.

About us

If you think all banks are the same, you'd be wrong. We're an innovative, fast-changing business that's shaping finance as a force for good. A bank that's empowering its people to innovate, explore possibilities and grow with purpose.

What you'll need

Proven experience embedding SRE practices within large-scale cloud environments.
Strong understanding of observability, monitoring, and incident response tooling.
Experience with infrastructure-as-code, CI/CD, and cloud-native technologies (e.g., GCP, Azure).
Ability to lead cross-functional teams and influence technical direction.
Familiarity with risk and compliance frameworks in financial services is a plus.

About working for us

Our focus is to ensure we're inclusive every day, building an organisation that reflects modern society and celebrates diversity in all its forms.

We want our people to feel that they belong and can be their best, regardless of background, identity or culture.

We were one of the first major organisations to set goals on diversity in senior roles, create a menopause health package, and launch a dedicated Working with Cancer initiative.

And it's why we especially welcome applications from under-represented groups.

We're disability confident. So, if you'd like reasonable adjustments to be made to our recruitment processes, just let us know.

We also offer a wide-ranging benefits package, which includes: