Job Title: Senior Data Lead Engineer
Location: Malaga, Spain - Hybrid: 1 day in the office per week
Duration: Permanent
Employment Type: Full-Time
Roles & Responsibilities:
We are seeking a Senior Data Lead Engineer to drive the cloud data, AI and BI transformation of our Global Transaction Banking (GTB) platforms. You will own the data lakehouse, data mesh, and AI enablement roadmap, delivering secure, scalable and business-ready data products across our hybrid (cloud + on-prem) environment.
What you will do
* Lead the Data, AI & BI platform strategy for GTB across the AWS and SCIB hybrid architecture
* Design and evolve the cloud data lakehouse (S3, Iceberg/Delta, EMR, Databricks)
* Build domain-oriented data products using data mesh principles (data-as-a-product, SLAs, ownership, contracts)
* Design and operate data ingestion, CDC and event-driven pipelines from GTB operational systems
* Integrate the on-premises data lake with the AWS cloud, ensuring catalog, lineage, security and governance
* Implement data quality checks, data rules, normalization and data guardrails
* Deliver analytics-ready datasets and semantic layers for BI, KPIs and dashboards
* Enable AI/ML and GenAI use cases: feature engineering, training, MLflow, RAG, fine-tuning, monitoring
* Provide technical leadership and mentoring across Data, ML and BI engineering teams
* Partner with business, product, risk and technology stakeholders to deliver high-impact data solutions
Required Experience
* 5+ years in Data Engineering, Cloud Data Platforms, AI/ML or Advanced Analytics
* Proven experience designing AWS data lakehouse architectures
* Hands-on with Databricks or EMR (Spark, Delta, MLflow, feature store)
* Strong background in ETL, CDC pipelines and event-driven ingestion
* Experience integrating on-prem + cloud data platforms
* Experience delivering production AI/ML solutions
* Strong involvement in data governance, data quality and data guardrails
* Experience working with BI teams, KPIs and semantic data models
Technical Skills
* AWS: S3, Glue, Lake Formation, EMR
* Databricks: Spark (PySpark/Scala), Delta, MLflow, feature store
* Lakehouse & Data Formats: Parquet, Iceberg / Delta, raw → curated → semantic layers
* Data Engineering: SQL, Python, ETL, CDC, event-driven pipelines
* Data Governance: lineage, catalog, data quality, observability, data rules
* BI Enablement: star schemas, semantic layers, KPI modelling
* AI/ML & GenAI: feature engineering, training, RAG, fine-tuning, LLM guardrails
* Hybrid Architecture: on-prem + AWS aligned with CDAIO standards
Nice to Have
* AWS or Databricks certifications
* Experience with CI/CD, orchestration, IaC for data platforms
* BI tools: QuickSight, Power BI, Qlik
* Agile tools: JIRA, Confluence