Principal MLOps Engineer (Azure) with hands-on cloud experience
Location: Remote from Spain (Spanish employment contract)
The client is a fast-growing tech startup offering an AI-powered platform that automates trade compliance and ESG processes. The platform helps businesses reduce risk, cut costs, and streamline documentation for trade operations across various industries.
Join a fast-scaling tech company on a greenfield AI infrastructure project. You'll play a foundational role in building, from scratch, the entire MLOps environment for a highly automated Azure-based AI platform used for large language model operations.
Requirements:
- 5+ years of experience as an ML Engineer, including solid hands-on MLOps work in cloud environments (ideally Azure).
- Proven track record implementing infrastructure with AKS and containerized workloads.
- Experience with Argo CD and Argo Workflows or similar GitOps tools.
- Proficiency in setting up CI/CD pipelines and IaC (Terraform, Bicep, or ARM templates).
- Strong experience with MLflow for tracking experiments and managing ML artifacts.
- Skilled in Python for automation, scripting, and integration tasks.
- Familiarity with PostgreSQL, especially with vector extensions (pgvector is a plus).
- Experience with Azure Cache for Redis and Azure Blob Storage.
- Strong knowledge of cloud security best practices (secret management, SSO, encryption).
- Detail-oriented with strong documentation and collaboration skills.
Nice to have:
- Hands-on experience integrating OpenAI APIs or working with LLMs.
- Familiarity with Azure DevOps, Microsoft Defender for Cloud, and web application firewalls.
- Knowledge of Traefik or similar reverse proxy solutions.
- Background in prompt observability or prompt engineering.
- Experience with advanced monitoring tools for AI pipelines.
- Certifications in Azure, Kubernetes, or MLOps-related fields.
Technology stack:
- Cloud & Infrastructure (AKS, Azure Virtual Network, Azure Load Balancer, Azure DNS, Azure Key Vault, AAD)
- CI/CD & Automation (Argo CD, Argo Workflows, Terraform, Bicep, Azure-native tools)
- MLOps & AI Tools (MLflow, pgvector, Azure Cache for Redis, Azure Blob Storage, OpenAI API)
- Development & Containerization (Helm, Docker, Python)
- Monitoring & Observability (MLflow, Azure Monitor, Grafana)
Responsibilities:
- Implement and deploy Azure-based MLOps infrastructure according to the established architecture.
- Set up and manage Kubernetes (AKS), Argo CD, Argo Workflows, PostgreSQL with pgvector, MLflow, Redis, and Blob Storage components.
- Build CI/CD pipelines using IaC tools, enabling automated deployments and scaling.
- Integrate MLflow for experiment tracking, prompt management, and observability.
- Implement secure secret management, database encryption, and SSO with Azure Active Directory.
- Migrate and refactor existing repositories into modular services, following the planned structure.
- Configure monitoring tools for AI model performance, prompt usage, and infrastructure health.
- Collaborate closely with the existing team to ensure smooth deployment and operations.
- Document processes, configurations, and operational best practices for long-term maintainability.
Why this position:
- This is a hands-on role where you will implement a modern, well-thought-out architecture that is ready to go.
- You will work with the latest cloud and MLOps tools (Azure, Kubernetes, MLflow, vector databases, and more).
- Your work will have a clear, visible impact: everything you build will directly support AI features that go into production.
- You will have a lot of freedom to shape how things get done, while still having a clear roadmap to follow.