Senior Site Reliability Engineer - Selby Jennings
  • England, London, City of London
  • Full Time, Permanent
  • Competitive salary
Job Description:
An industry-leading global investment firm is seeking a Senior Site Reliability Engineer to join its core engineering platform team. This is a high-impact role where you will help shape the reliability, observability, and operational excellence of a rapidly scaling technology environment that underpins cutting-edge research and trading systems.
If you are passionate about building resilient platforms, automating everything, and influencing engineering standards across an organisation, this role offers exceptional scope and technical challenge.
As a Senior SRE, you will:
* Lead the development and evolution of the firm’s observability stack, ensuring high-quality metrics, alert fidelity, and scalable system health monitoring.
* Build reliable, low-noise dashboards and alerting using modern tooling across metrics and logs.
* Improve incident detection, response, and post-incident processes through automation, configuration improvements, and engineering changes.
* Define and apply SLIs/SLOs to support operational and strategic decision-making.
* Enhance reliability, scalability, and operability of core services through hands-on development work.
* Reduce manual operational tasks by identifying recurring issues and implementing automation.
* Apply Infrastructure as Code principles across observability and platform components.
* Develop tooling and automation primarily in Go (preferred) or Python.
* Shape engineering standards by introducing best-practice patterns, documentation, and platform defaults.
* Collaborate with service-owning teams to deliver measurable, sustained platform reliability improvements.
What You’ll Bring
* Strong, practical SRE and SWE experience within production environments.
* Hands-on experience operating containerised workloads (Docker or Podman).
* Essential development experience in Go (preferred) or Python.
* Experience with Grafana (dashboards and alerting).
* Strong Infrastructure-as-Code experience across Terraform and/or Ansible.
* Familiarity with OpenTelemetry: metrics, logs, and tracing.
* Kubernetes and cloud-native engineering experience.
* Exposure to datacentre compute platforms and hardware-backed services.
* AWS configuration and deployment experience.
Job number 3463556

Company Details:
eFinancialCareers