Site Reliability Engineer
Posted 1 hour 15 minutes ago by Infoplus Technologies UK Ltd
Must have skills:
- Unix Shell Scripting, SQL
- Troubleshooting using logs, Splunk/Dynatrace
- ITSM - Incident, Change and Problem Management
- L2 Support experience is a must
- PCF Cloud knowledge
- CI/CD, Jenkins, Git & Maven
JD as below:
Roles/responsibilities performed by the Biz Ops React team members and tools used in their applications.
Incident Resolution - Review and resolve Incidents arising from:
- Operation Command Center alerts
- Alerts from Enterprise Monitoring Operations (EM Operations)
- OMNIBUS and Splunk alerts
Change Implementation - Deploy application-related artifacts to the production environments within the approved release window
Reporting the issues with the deployments and coordinating with the Development Teams to fix any deployment issues
Work Orders - Resolve Work Orders in the form of business/functional queries, ad hoc testing, verification and validation, etc., from the Regional product team and customer support teams.
Traffic Routing - perform traffic routing in support of infrastructure maintenance
Perform detailed Root Cause Analysis for high-severity Incidents, fix the underlying cause, and take the necessary preventive actions.
Supporting the UAT testing by the Product team and Regional customer support team.
Configuring application/artifacts and supporting the new customer onboarding to the platform
Testing newly onboarded customers' file processing and reports delivery
Raise new change tickets and arrange for approvals, including CAB approvals
Review and approve change tickets.
Creating Confluence pages for newly analyzed Work Orders/new type of Incidents with resolution steps
Work with customers on ad-hoc queries
Work with Development/Testing team for defect analysis (with Production simulated data)
Build automation scripts that reduce the number of Incidents and/or improve the processes followed
Support customers in filling in the Post Incident Report (PIR) when a high-impact Incident affecting customers occurs.
Participate in or initiate War Room calls for issues that impact application availability or customers
Tools Used:
Remedy - Ticketing Tool
Rally - Story and Bug Tracking
Splunk and Dynatrace for Monitoring
WinSCP (file movement/validation)
CyberArk/PuTTY
Toad - Querying Tool for DB