QA Engineer for data platform
Location:
Remote from Spain (Spanish employment contract)
Join a transformative data and AI platform initiative aimed at modernizing enterprise-scale capabilities and enabling real-time decision-making. This project delivers a comprehensive roadmap covering AI, MLOps, data governance, and platform scalability, supporting a shift towards data-first operations and intelligent automation.
Requirements:
* Fluent English and clear communication with both technical and non-technical stakeholders.
* 3+ years in data testing and code review across ETL/ELT pipelines and analytical warehouses.
* 1–2 years of hands-on experience with the AWS data stack (Airflow/MWAA, Spark on EMR or AWS Glue), strong SQL, and BI validation (QuickSight or similar).
* Solid grasp of data models and transformation validation (schema/contract tests, S2T mapping, reconciliation, CDC/SCD).
* Experience validating data governance controls (RBAC, lineage, data quality scorecards) and working with PII under GDPR-style constraints.
* Familiarity with CI/CD for data (GitHub Actions/Terraform), observability (CloudWatch), and test automation in pipelines.
Responsibilities:
* Design and execute end-to-end validation across layers (source → S3/landing → staging/curated), including schema, constraint, reconciliation, boundary/negative, and CDC/SCD tests.
* Verify pipeline logic and performance in Airflow/MWAA, Spark/EMR/Glue; add automated checks and monitors (CloudWatch), and integrate tests into CI/CD.
* Validate analytical outputs in QuickSight and via APIs: drill into segmentations, pass-rate/volume trends, anomaly alerts, and supplier dataset health metrics.
* Test SHAP/insight generation flows and recommendation outputs (including on-demand runs), ensuring accuracy, explainability, and guardrails for risk/compliance.
* Enforce data governance in practice: lineage checks, RBAC/approvals, audit logging, and quality scorecards; ensure safe test-data strategies for PII.
* Review data requirements, S2T/mapping specs, and domain assumptions for sources to ensure completeness and testability.