Experteer Overview
Candidates should take the time to read every part of this job posting carefully. Please submit your application without delay.
In this role, you will strengthen the reliability and resilience of our payment systems. You will lead initiatives to monitor, log, and recover from incidents, working with cross-regional on-call rotations. You will analyze production issues to drive best practices and design scalable infrastructure, including IDCs and data protection plans. You will collaborate with engineering teams to ensure high availability and secure, compliant operations. This is a mission-critical role with a clear impact on system stability and customer trust.
Compensation / Benefits
• Lead reliability initiatives for payment systems, including monitoring, logging, dashboards, and disaster recovery planning across regions
• Handle incident response, drills, contingency planning, and on-call participation to ensure rapid production issue resolution
• Analyze production issues to identify bottlenecks and drive improvements for a highly available payment architecture
• Architect and deploy infrastructure solutions, including new IDCs, and implement data protection plans meeting compliance and security standards
Responsibilities
• Solid CS foundations with knowledge of Unix/Linux, storage, and networking principles
• Proficiency in at least one programming language (Java/Python/Shell) with experience building ops/maintenance tools
• Strong problem-solving, communication, and ownership mindset
• Experience with cloud platforms and relevant ecosystems (GCP/OCI) is a plus; knowledge of OLAP platforms and native services is beneficial
Key Requirements