Job Description: Senior MLOps Engineer
Location: Remote from Spain (Spanish employment contract)
We are seeking an experienced MLOps Engineer with expertise in Google Cloud Platform (GCP) to design, build, and optimize end-to-end AI, ML, and data engineering pipelines. This role involves deploying machine learning models, LLMs, and traditional AI models, as well as managing data processing workflows in a GCP-first environment.
The ideal candidate will have experience working with Google Kubernetes Engine (GKE), Apache Spark, Dataproc, Terraform, Vertex AI, and Airflow (Cloud Composer) to ensure scalable and efficient AI/ML operations. While Amazon Web Services (AWS) experience is a plus, it is not required.
Requirements:
* 4-year degree preferred; relevant experience will be considered.
* 3+ years of MLOps/DevOps/Data Engineering experience, with expertise in Google Cloud Platform (Vertex AI, Dataproc, BigQuery, Cloud Functions, Cloud Composer, GKE).
* Hands-on experience building AI/ML pipelines and data engineering workflows using Apache Airflow (Cloud Composer), Spark, Databricks, and distributed data processing frameworks.
* Experience working with LLMs and traditional AI/ML models, including fine-tuning, inference optimization, quantization, and serving.
* Proficiency in CI/CD for ML, version control (Git), and workflow orchestration and experiment tracking (Airflow, Kubeflow, MLflow).
* Strong experience with Terraform for infrastructure automation.
* Strong knowledge of Apigee for deploying, managing, and securing machine learning APIs at scale.
* Production-ready AI/ML solutions: Proven ability to build, deploy, and maintain AI models in real-world production environments.
* Programming Skills: Proficiency in Python and familiarity with Bash, Scala, or Terraform scripting.
* Experience with security best practices for ML models, including IAM, data encryption, and model governance.
Bonus Qualifications/Experience:
* Experience with multi-cloud AI/ML solutions.
* Familiarity with AWS AI/ML services (SageMaker, EMR, Lambda, EKS, DynamoDB).
* Knowledge of Feature Stores (Feast, Vertex AI Feature Store, Amazon SageMaker Feature Store).
* Understanding of AIOps and ML observability tools.
* Experience with real-time AI inference pipelines and low-latency model serving.
* Experience with GitLab CI/CD, with a focus on GCP deployments.
* Experience working with PHI/PII in HIPAA- and/or GDPR-compliant environments.
Responsibilities:
* Build, deploy, and automate AI and ML pipelines on Google Cloud Platform (GCP) using tools such as Vertex AI, BigQuery, Dataproc, Cloud Functions, and GKE.
* Deploy, optimize, and scale Large Language Models (LLMs) and other AI/ML models using tools, models, and frameworks such as Hugging Face Transformers, the OpenAI API, Google Gemini, Meta Llama, TensorFlow, and PyTorch.
* Design and manage data ingestion, transformation, and processing workflows using Apache Airflow (Cloud Composer), Spark, Databricks, and ETL pipelines.
* Deploy AI/ML models and data services using Docker, Kubernetes (GKE), Helm, and serverless architectures including Cloud Run.
* Automate and manage ML/AI deployments using Infrastructure as Code tools such as Terraform and CI/CD pipelines with GitHub Actions or GitLab CI/CD.
* Develop scalable, fault-tolerant ML pipelines to train, deploy, and monitor models in production environments.
* Deploy AI models using TensorFlow Serving, TorchServe, FastAPI, Flask, and GCP-native serverless technologies like Cloud Run.
* Implement monitoring, drift detection, and performance tracking for AI/ML models using MLflow, Prometheus, Grafana, and Vertex AI Model Monitoring.
* Ensure security, governance, access control, and compliance best practices across AI and ML workflows.
* Design cloud-native architectures with GCP as the core platform, utilizing its AI/ML and data engineering tools.