Join an ambitious and fast-growing deep tech environment where you'll shape the infrastructure behind cutting-edge AI models. This is a high-impact opportunity to work at the intersection of large-scale systems engineering and advanced machine learning, driving performance, efficiency, and scalability of next-generation LLMs.
What you will do
* Architect and optimize distributed training pipelines for large-scale AI models using NVIDIA NeMo and related frameworks.
* Design and deploy high-performance inference systems leveraging vLLM, TensorRT-LLM, and SGLang.
* Implement advanced optimization techniques such as continuous batching, quantization (FP8/AWQ), and PagedAttention.
* Orchestrate ML workloads across cloud and on-prem clusters using SLURM, Ray, Flyte, or similar tools.
* Build and maintain end-to-end ML lifecycle systems including experiment tracking, versioning, and deployment workflows.
* Perform deep performance profiling and bottleneck analysis across GPU, networking, and system layers.
* Drive cost optimization strategies for GPU usage and infrastructure scaling.
* Lead technical direction, mentor engineers, and establish best practices across the team.
What you bring
* 5+ years of experience in MLOps, DevOps, or software engineering, including at least 2 years working with LLM infrastructure.
* Strong expertise in PyTorch and the NVIDIA ecosystem (CUDA, NCCL, Triton).
* Hands-on experience with distributed training frameworks (e.g., NeMo, Megatron).
* Experience with LLM inference optimization tools such as vLLM, TensorRT-LLM, or SGLang.
* Solid knowledge of Kubernetes and cluster orchestration tools (SLURM, Ray, Flyte, SkyPilot).
* Experience managing ML lifecycle tools (MLflow or similar).
* Strong programming skills in Python, with working knowledge of C++ or Rust.
* Understanding of high-performance systems, GPU optimization, and scalable infrastructure.
* Strong problem-solving mindset with the ability to work across systems and ML domains.
Why join
* Work on cutting-edge AI infrastructure powering next-generation models.
* Tackle complex challenges in performance, scalability, and efficiency at scale.
* Collaborate with highly skilled engineers and researchers in a deep tech environment.
* Influence technical direction and contribute to core platform innovation.
* Competitive compensation and strong growth opportunities in an emerging AI leader.
Ready to take the next step?
If you're motivated and excited by this opportunity, apply now or email
By applying to this role, you understand that we may collect your personal data and store and process it on our systems. For more information, please see our Privacy Notice.
In accordance with local employment laws, applicants must have current, valid authorisation to work in Spain at the time of application. We are unable to sponsor employment visas for this role. Applications from individuals without existing work authorisation for Spain cannot be considered.