Overview
As a Site Reliability Engineer, you will solve exciting technical challenges by defining, designing, deploying, and troubleshooting key OHH products, platforms, and infrastructure, with a focus on reliability, scalability, telemetry, security, and performance.
Our team works within the Oracle Health Production Operations team with a core focus on customer-aligned system deployments. We are responsible for building and maintaining the products and solutions that enable our customers to operate with greater efficiency, security, and attention to detail. As a member of the team, you will be surrounded by bright and innovative minds thriving in a collaborative environment supporting infrastructure and applications. We empower our team to make their day-to-day workflows more efficient and productive, which leads to superior product availability and a better support experience for our customers.
Responsibilities
What You'll Do:
Product Ownership – You'll join the OHH Production Engineering team responsible for infrastructure ownership of various products, including Windows OS, Unix, Oracle Cloud Infrastructure, WebSphere, DB engines, and Citrix technologies.
Ownership Scope – As an SRE, you'll own the end-to-end configuration, dependencies, and behaviour of presentation tier products. You'll work with development partners to ensure products are designed for availability, security, reliability, scale, and performance. As the ultimate authority, you'll be accountable for performance, telemetry, and automation.
Product Operations – As the Oracle Cloud evolves, you'll partner with IP and development teams to shape product architecture improvements. As an SRE, you'll be a technical expert, articulating product characteristics and dependencies.
Operations Engineering – You'll understand and communicate the scale, capacity, security, performance, and telemetry of your products. As a Subject Matter Expert, you'll have in-depth knowledge of every aspect of your product stack. In this role, you will:
Analyse and optimise system performance and resource usage under load
Build metrics and instrumentation to understand product behaviour
Ensure scalability, resilience, and effective disaster recovery
Drive security operations and compliance with corporate standards
Implement improvements through CI/CD pipelines
Provide technical expertise for complex issues and root cause resolution
Prevent recurring problems with long-term fixes
Support day-to-day operations and participate in the on-call rotation
Maintain a deep understanding of product architecture and its impact on distributed systems
Represent the SRE function across teams with strong communication and customer focus
Required experience:
5+ years of experience managing complex IT systems
Fluent in English (C1) — all work is conducted in English
Strong experience with Windows GPO and in-depth administration
Solid networking knowledge
Methodical approach to solving complex problems
Proficiency in coding/scripting languages: Python, JavaScript, Bash, PowerShell
Preferred experience:
Citrix infrastructure and technologies
Experience with DevOps toolchains, including:
Configuration management tools: Chef, Ansible/OLAM
Monitoring & observability: Splunk, Zabbix, Grafana
CI/CD pipelines
Kubernetes and container platforms
Oracle Cloud Infrastructure or other private cloud providers
Deep Technical Skills (Bonus):
Windows Server Roles:
Active Directory: Trusts, sites, replication
Group Policy: GPO creation and management, policy templates
Certificates: Template configuration, deployment, renewal
File Servers & Storage: Planning and managing storage environments
Ability to define and document complex, scalable technical architectures
Hybrid working environment – 1-2 days per week in the Barcelona Office