The Position

The 21st century needs a 21st century healthcare system. To help build this, Roche is not only developing highly personalized medicine and advanced diagnostics, but also investing heavily in software and digital solutions to speed up medical processes and make them safer and more accessible to a wide range of people.

The team you're joining is working on solving Kubernetes operations at a large scale, ultimately aiming at deploying and managing thousands, even tens of thousands, of Kubernetes clusters around the world. To get there, we not only have to fully automate all processes around that orchestration, but also focus on a great developer experience for our users.

If you are excited to shape the way healthcare works in the coming decade, we would love to welcome you to our highly skilled team, which strives to become a role model for the entire industry.

Who we are looking for

With this position we are looking for a skilled engineer who finds deep joy in building the most stable production system possible.
Your work will not only include monitoring the uptime of our platform and being the first responder for issues, but also the design and hands-on implementation of the observability stack that is required to do this.

For that, you already bring a few years of experience in developing and operating large scale software systems, with your more recent work focusing on Kubernetes clusters and CNCF observability stacks.

Daily Business

Things we need you to do "every day": ensuring that our production systems are available and stable for our users, mostly by:

- Designing and implementing the observability stack on Kubernetes
- Monitoring production systems for instabilities and reacting to alerts
- Defining and testing disaster recovery processes
- Documenting your work in the form of playbooks or by giving feedback to the engineering team in the form of improvement proposals
- Mentoring other engineers in the team on SRE best practices by participating in design proposals (RFCs) or merge request reviews

Things we care about

- Engineering first mentality
- Doing meaningful work
- A result oriented mindset
- Quality of our service
- Continuous improvement and learning
- Teamwork
- And the things that you care about!

It's a plus if you bring experience with (otherwise, we'll teach you):

- Kubernetes
- Public clouds like AWS
- Prometheus / Thanos / Grafana / OpenTelemetry

Tools you already know or would like to learn: HashiCorp Vault, FluxCD, Kustomize, Terraform, GitLab, Slack

The position is located in Sant Cugat (Barcelona), Spain.

We're a distributed team (Canada, Spain, Switzerland, China) but also highly value weekly face to face time in the office, not only for the spontaneous (technical or off topic) kitchen gatherings but also to leverage and build the great team spirit and close collaboration that are needed to achieve what we set out for.

We strongly believe that our capability to regularly meet in real life is not a bug, but a feature! Besi