Senior Machine Learning Operations Engineer
Location: Remote from Spain (on an indefinite Spanish employment contract)
Make retail great again through the power of technology! Intellias helps retailers provide consistent and customer-centric shopping experiences across all channels with disruptive retail tech solutions. Get on board and make your own contribution to the industry!
Project Overview:
Our client is the fastest-growing global manufacturing company: an international corporation with over a hundred years of history, internationally recognized brands, and Reduced-Risk Products.
Intellias' mission is to support the client's strategy and efforts in the digital and e-commerce space (e-commerce and other mobile apps, payment gateways, loyalty system, search engine, employee management, identity management, etc.).
The newly conceptualized Digital Ecosystem comprises a set of capabilities, including an online shop and website, online-to-offline linking, customization and personalization, engagement and membership, and digital products and services.
Requirements:
* 5+ years in MLOps/platform architecture or adjacent roles, with shipped AI systems
* Proficiency in Python and a strong grounding in software engineering principles
* Deep experience with at least one major cloud (AWS/Azure/GCP) and platform engineering (containers, Kubernetes, IaC such as Terraform)
* Experience in designing and guiding scalable machine learning pipelines for model training, validation, and deployment
* Proven CI/CD design for GenAI/ML (evaluation gates, versioning, canary, rollback) and collaboration with security/governance stakeholders
* Sound judgement selecting RAG/vector and provider stacks based on performance, cost, compliance, and portability
* Experience with agent orchestration frameworks (e.g., LangGraph/Semantic Kernel) and tooling protocols (e.g., MCP)
* Experience operationalizing multi-agent systems (tools/routing/memory/guardrails, human-in-the-loop)
* Process automation and enterprise integrations
* Excellent communication and interpersonal skills to collaborate effectively with cross-functional teams, stakeholders, and leadership
* English at Upper-Intermediate level or higher
Nice to have:
* Master's degree or higher in Computer Science, Engineering, or a related field
* On-prem LLM deployments; performance and cost tuning with caching and model routing
* AI safety, policy, and compliance experience in sensitive environments
* Public speaking, enablement, and building reusable accelerators
* Domain exposure in automotive, retail, manufacturing, healthcare, energy, finance, or telecom
Responsibilities:
* Lead discovery with stakeholders and define adoption roadmaps and reference architectures
* Set lifecycle practices for GenAI (LLMOps)
* Architect retrieval and provider layers (RAG, vector stores, model gateways) with portability, cost, and compliance in mind
* Implement RAG/agent workflows that orchestrate tool-calling, retrieval, and grounded answering
* Enable agentic applications at platform level, and define solution patterns and evaluation gates (standardized tools, routing, shared memory, human-in-the-loop, safe fallbacks) aligned with enterprise integration, security, and cost
* Set standards for ingestion, chunking, embedding, and indexing pipelines; select and tune vector databases for retrieval
* Establish CI/CD, Infrastructure-as-Code, observability, and automated testing
* Define governance and safety guardrails
* Establish environment strategy and promotion paths, and a clear handover plan to client teams
* Package reusable patterns/accelerators, mentor engineers, and support presales and proposals