Role Overview
We are seeking a highly experienced Service Engineer L3 to join our service operations team. This role is intended for senior professionals with deep expertise in troubleshooting complex online and distributed systems, combined with strong leadership and operational excellence capabilities.
Shift Requirements
The position requires shift work and on‑call duties as part of a continuous operations model.
Key Responsibilities
- Lead the resolution of complex and high‑impact incidents in distributed and online service environments
- Perform deep‑dive diagnosis and advanced debugging of critical issues
- Act as the escalation point for L1, providing technical guidance and leadership
- Drive and oversee root cause analysis (RCA) and post‑incident reviews across teams
- Identify systemic issues and implement long‑term solutions to improve service reliability
- Design and develop automation solutions to optimise operational efficiency and reduce manual intervention
- Build dashboards to provide visibility into SLA performance, service health, and team workload
- Develop reports to provide insights on technology performance and recommend improvements
- Collaborate with engineering and product teams to influence service design and resilience
- Communicate effectively with stakeholders, including senior leadership, customers, and partners
- Ensure compliance with data protection regulations, including GDPR
Required Skills & Experience
- Strong college hire or 1–2 years of experience in service operations
- For L2 Engineer: 2–4 years of experience diagnosing/debugging faults in complex online services
- For L3 Engineer: 6+ years of experience diagnosing/debugging faults in complex online services
- Demonstrated experience diagnosing/debugging faults in distributed systems
- Proven ability to lead teams while performing hands‑on individual contributor work
- Working knowledge of enterprise network gear including routers, switches, and load balancers
- Working knowledge of enterprise routing protocols and IP subnetting
- Experience using diagnostic tools such as Netmon, WinDbg, and Wireshark
- Advanced experience with scripting using PowerShell, SQL, and Python
- Ability to identify and script automatable problems at scale, with a focus on efficiency and reliability
- Ability to build dashboards for SLA tracking and operational visibility
- Ability to build analytical reports to drive service and technology improvements
- Knowledge of Azure and Microsoft 365 architectural concepts (Azure Portal, Storage Nodes, VMs, etc.)
- Strong understanding of GDPR and data protection principles
Core Competencies
- Expert‑level troubleshooting and analytical skills in complex environments
- Strong leadership and mentoring capabilities across operational teams
- Ability to manage and resolve critical incidents under pressure
- Strong communication skills in written and spoken English (fluent level required, B2/C1)
- Ability to interact with external customers and partners
- Strong focus on automation, scalability, and continuous improvement
- Ability to execute with precision in high‑impact, time‑sensitive scenarios
- Strategic thinking with a focus on long‑term service reliability and optimisation
- High level of ownership, accountability, and decision‑making
Working Model
- 12x5 service coverage (8:00 AM to 8:00 PM) with rotating shifts
- Participation in on‑call (standby) rotations
- Fully on‑site role (Madrid, Málaga, or Asturias offices)