SRE / Site Reliability Engineer
🏠 Location: fully remote
💻 Skills: AWS, Kubernetes, EKS, Terraform, high-scale systems, monitoring
Truly career-defining roles for Site Reliability Engineers with one of Europe’s fastest-growing tech businesses.
Why join/key responsibilities
High-impact role where you will monitor systems that handle incredibly high traffic daily
Oversee and troubleshoot production systems with focus on uptime, reliability, and performance
Respond to critical incidents as part of an on-call rotation and support the team in 24/7 operations
Build and optimise monitoring and alerting systems across Kubernetes (EKS)
Proactively build scripts and automated solutions to prevent recurring issues
Develop infrastructure tooling and enhance CI/CD processes
Work cross-functionally with engineering and product teams to ensure seamless updates and minimal user impact
Conduct thorough root cause analyses (RCAs) and postmortems to prevent future incidents
Great salary, lucrative bonus and unlimited holiday
What you’ll bring
Hands-on expertise in Kubernetes (EKS preferred) - from deployment to advanced troubleshooting
Comfortable with 24/7 on-call support for critical events
Strong track record with the AWS ecosystem, Terraform, Docker, and modern CI/CD pipelines
Proven experience working with observability tools like Datadog, Prometheus, Grafana, and log stacks such as ELK or CloudWatch
Solid scripting skills in Python, Go or Node.js
Understanding of network protocols
Strong problem-solving mindset and a sense of ownership
A proactive approach to operational excellence
Happy working fully remote in a B2B capacity
--
We make an active choice to be inclusive towards everyone every day. Please let us know if you require any accessibility adjustments during the application or interview process.
Signify’s mission is to empower every person, regardless of their background or circumstances, with an equitable chance to achieve the careers they deserve. Building a diverse future, one placement at a time.