Senior Machine Learning Engineer
Profile sought (male/female):
* Design and implement strategies for creating, sourcing, and augmenting datasets tailored for LLM training and fine‑tuning.
* Develop scalable pipelines to collect, clean, filter, annotate, and validate large volumes of text data, ensuring quality and ethical compliance.
* Collaborate with ML engineers, researchers, and software engineers to meet ambitious goals in LLM preparation and in complementary work such as dataset preparation, model evaluation, and model serving.
* Develop and integrate new routines for modifying and enhancing LLMs, extending their functionality.
* Make effective use of distributed compute resources and clusters (GPUs), identifying opportunities for further optimization.
* Lead the end‑to‑end preparation of compressed and specialized LLMs for use in production.
* Stay current with research trends in LLM foundation models, dataset curation, pre‑training data, and benchmarking.
* Contribute to documentation and development standards, and help maintain a healthy shared code base.
* Mentor other engineers and share knowledge of cutting‑edge techniques.
You will join a European deep‑tech leader in quantum and AI, in a hybrid role based in Zaragoza.
Qualifications
* Master's or Ph.D. in Computer Science, AI, Data Science, Physics, Math, or a related field (or equivalent industry experience).
* 4+ years of experience in data science, machine learning, or related roles, with demonstrated experience with NLP or LLMs.
* In‑depth knowledge of large foundation model architectures (language and multimodal models) and their lifecycle: training, fine‑tuning, alignment, and evaluation.
* Proficiency in Python and data tooling ecosystems (Pandas, NumPy, Hugging Face Datasets, Transformers libraries).
* Hands‑on experience with text data collection from diverse sources: web scraping, APIs, proprietary corpora, etc.
* Strong understanding of data quality metrics including bias detection, toxicity, and readability.
* Experience in large shared distributed computing environments and familiarity with tools for hardware optimization (vLLM, TensorRT, NeMo, etc.).
* Experience with version control (git), unit testing, and core software development practices.
* Fluency in English and Spanish.
Compensation & Benefits
* Competitive salary.
* Two unique bonuses: signing and retention.
* Fixed‑term contract with possibility of becoming permanent.
* Hybrid role and flexible working hours.
* Opportunity to be part of an organization focused on technology innovation.
Keywords: Machine Learning, LLM, Python, Pandas, NumPy, Hugging