Lead Site Reliability Engineer
Posted 15 days 8 hours ago by Lloyds Banking Group
JOB TITLE: Lead Site Reliability Engineer
LOCATION(S): Edinburgh, Halifax or London
HOURS: Full time
WORKING PATTERN: Hybrid, 40% (or two days) in one of the above offices
About this Opportunity
The Lead SRE is accountable for the reliability, scalability, and performance of cloud infrastructure and platform services supporting Risk Foundations. This role ensures that services meet defined Service Level Objectives (SLOs), manages error budgets, and leads incident and problem management across multiple feature teams. The Lead SRE supports SRE methodologies and collaborates with product and application teams to integrate reliability engineering into delivery pipelines.
Key Responsibilities
Reliability & Performance Management: Design, implement and own the SLOs for critical platform services. Monitor system health, manage error budgets, and drive improvements in Mean Time to Failure (MTTF) and Mean Time to Recovery (MTTR).
Incident & Problem Management: Lead incident response and post-mortem analysis. Ensure root cause identification and long-term remediation strategies are implemented.
Platform Advocacy & Collaboration: Champion SRE principles across Risk Foundations Labs. Collaborate with Lab Product Owners, Engineering Leads, and application teams to embed reliability into design and delivery.
Technical Leadership: Provide technical oversight across cloud infrastructure, CI/CD pipelines, observability tooling, and automation frameworks. Guide engineers in adopting scalable and resilient solutions.
Continuous Improvement: Identify and implement improvements in deployment, monitoring, and alerting processes. Drive automation to reduce toil and improve operational efficiency.
Governance & Compliance: Ensure platform services adhere to internal risk, security, and compliance standards. Support audit and regulatory reporting requirements.
About us
If you think all banks are the same, you'd be wrong. We're an innovative, fast-changing business that's shaping finance as a force for good. A bank that's empowering its people to innovate, explore possibilities and grow with purpose.
What you'll need
Proven experience embedding SRE practices within large-scale cloud environments.
Strong understanding of observability, monitoring, and incident response tooling.
Experience with infrastructure-as-code, CI/CD, and cloud-native technologies (e.g., GCP, Azure).
Ability to lead cross-functional teams and influence technical direction.
Familiarity with risk and compliance frameworks in financial services is a plus.
About working for us
Our focus is to ensure we're inclusive every day, building an organisation that reflects modern society and celebrates diversity in all its forms.
We want our people to feel that they belong and can be their best, regardless of background, identity or culture.
We were one of the first major organisations to set goals on diversity in senior roles, create a menopause health package, and launch a dedicated Working with Cancer initiative.
And it's why we especially welcome applications from under-represented groups.
We're disability confident. So, if you'd like reasonable adjustments to be made to our recruitment processes, just let us know.
We also offer a wide-ranging benefits package, which includes:
A generous pension contribution of up to 15%
An annual bonus award, subject to Group performance
Share schemes including free shares
Benefits you can adapt to your lifestyle, such as discounted shopping
30 days' holiday, with bank holidays on top
A range of wellbeing initiatives and generous parental leave policies
Want to do amazing work that's interesting and makes a difference to millions of people? Join our journey.