Lead Site Reliability Engineer

Posted 20 hours 34 minutes ago by Dormont Manufacturing Co

Permanent
Not Specified
Other
Midlothian, Edinburgh, United Kingdom, EH120
Job Description
Role Purpose

The Site Reliability Engineer will work closely with Application, Infrastructure, and Network Engineering teams to ensure the reliability, scalability, and performance of FNZ platforms. This role focuses on deploying, integrating, and providing ongoing operational support for mission-critical systems, leveraging modern automation and cloud-native practices.

Key Responsibilities
  • Maintain high availability and performance of FNZ platforms.
  • Implement monitoring, alerting, and observability solutions to proactively detect and resolve issues.
  • Collaborate with engineering teams to design and implement robust deployment pipelines.
  • Ensure smooth integration of applications with infrastructure and network components.
  • Use Terraform for provisioning and managing infrastructure across environments.
  • Operate and optimize workloads on-premises and in public cloud.
  • Manage and troubleshoot application delivery networks, load balancing, and traffic routing.
  • Configure and support F5 Distributed Cloud or similar CDN/ADC technologies.
  • Participate in on-call rotations, perform root cause analysis, and implement preventive measures.
  • Work cross-functionally with Application, Infrastructure, and Network Engineering teams to deliver reliable services.
Required Skills & Experience
  • Kubernetes (K8s): Deep understanding of container orchestration and cluster management.
  • Terraform: Strong experience in Infrastructure as Code for cloud and on-prem environments.
  • Public Cloud: Hands-on experience with AWS, Azure, or GCP.
  • F5 Distributed Cloud or similar: Knowledge of CDN/ADC platforms and their integration.
  • Networking Fundamentals: Expertise in application delivery networks, load balancing, traffic routing, and troubleshooting.
  • Observability Tools: Familiarity with Splunk, New Relic, or similar.
  • Scripting & Automation: Proficiency in Terraform, Bash, or similar languages.
Desirable Skills
  • Experience with CI/CD pipelines and GitOps workflows.
  • Knowledge of SRE principles.
  • Familiarity with security best practices.
Key Attributes
  • Strong problem-solving and troubleshooting skills.
  • Ability to work collaboratively across multiple teams.
  • Passion for automation and reducing operational toil.
Reporting Line

Reports to: Head of Platform Operations/Application Engineering.
Works closely with the Application Engineering, Infrastructure Engineering, and Network Engineering teams.
