Site Reliability Engineer (Python)
Posted 11 days 11 hours ago by La Fosse Associates
Site Reliability Engineer
£70,000 pa
Hertfordshire
My client, a leading entertainment group, is looking for a mid-level SRE to join their platform team in their Hertfordshire office.
In this role, you'll take ownership of the end-to-end monitoring and alerting stack, designing and maintaining infrastructure and alert configurations (e.g., with Prometheus/Grafana or equivalent) and building dashboards that clearly communicate metrics to business stakeholders. You'll drive system automation and integration, crafting scripts and workflows, primarily in Python, to onboard new services, schedule jobs via Control-M, and bridge systems using custom automation frameworks. On the CI/CD front, you'll be responsible for authoring and refining GitHub Actions pipelines to automate deployments into test and production environments. Finally, you'll maintain a holistic view of the infrastructure and gaming platform stack.
You will have the following:
- Experience with application monitoring tools, e.g., AppDynamics.
- Working knowledge of AWS and the use of Terraform and Ansible.
- Strong experience with Python.
- Hands-on experience with CI/CD tools.
- Strong understanding of network and web technologies.
- Experience with Jira.