Overview
OpenNebula Systems is a European leader in open source technology, helping organizations manage data centers and build Enterprise Clouds. The AI Factory product line delivers sovereign, edge-to-cloud AI infrastructure, enabling deployment, orchestration, and optimization of AI workloads with full control. This role is part of the team developing the AI Factory product line in Europe.
We are hiring an AI Systems Engineer – LLM Execution to design and implement the execution layer powering the AI Factory vision.
Responsibilities
* Design, implement, and optimize LLM inference pipelines for multi-GPU and multi-node environments.
* Integrate with inference engines (e.g., vLLM, TensorRT-LLM, DeepSpeed).
* Tune execution parameters for latency, throughput, and memory efficiency across heterogeneous infrastructures.
* Coordinate LLM serving with orchestration frameworks such as Ray, NVIDIA NeMo/Dynamo, and others.
* Integrate with LLM catalogs and registries (e.g., HuggingFace, NVIDIA NIM, internal repositories).
* Collaborate with product and platform teams to shape a modular, portable AI Factory execution layer.
* Interact with users to provide systems support, architecture definitions, recommendations, implementation, testing, training, and deployment of open source solutions.
* Troubleshoot incidents, identify root causes, implement fixes, and document preventive measures.
* Deliver quality performance indicators and maintain project documentation (journals, status reports, etc.).
* Engage with international cloud-edge ecosystems and participate in open-source communities; willingness to travel occasionally.
* Write and maintain software documentation and project reports.
Qualifications
* Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field.
* Strong hands-on experience deploying and optimizing LLMs in production environments.
* Experience with inference frameworks such as vLLM, TensorRT, Triton Inference Server, or DeepSpeed-Inference.
* Hands-on experience with orchestration tools like Ray, NVIDIA NeMo/Dynamo, or KServe.
* Experience deploying LLM workloads on hybrid or sovereign cloud environments.
* Contributions to open-source LLM or inference projects.
* Deep knowledge of multi-GPU systems and GPU memory management.
* Solid understanding of distributed systems and networking bottlenecks in model serving.
* Programming experience in Python; knowledge of CUDA and model quantization is a plus.
* Familiarity with LLM catalogs (e.g., HuggingFace, NGC, NIM) and open-source MLOps or AI workload orchestration platforms.
* Professional English fluency with strong writing and speaking clarity.
Soft Skills & Collaboration
* Strong customer service mindset with a focus on responsiveness and user satisfaction.
* Clear communication and documentation, with strong written and verbal English and comfort with asynchronous collaboration.
* Excellent problem-solving and proactive issue resolution.
* Self-management, accountability, and ability to work independently and meet deadlines.
* Technical autonomy with Git, CI/CD, remote collaboration tools (Slack, Zoom, GitHub), and problem-solving without direct supervision.
What’s in it for me?
* Competitive compensation and flexible remuneration options (meals, transport, childcare).
* Customized workstation (macOS, Windows, Linux).
* Private health insurance.
* 6-hour Fridays and August work rhythm.
* Paid time off: holidays, personal time, sick time, parental leave.
* All-remote company with HQ in Madrid and offices in Boston and Brno.
* Healthy work-life balance and support for digital disconnection.
* Flexible hiring options: full-time or part-time; employee (Spain/USA) or contractor (other locations).
* Engineering-first culture with openness, collaboration, risk-taking, and continuous growth.
* Exposure to a broad technology ecosystem with opportunities to learn and research new technologies.
Seniority level
* Mid-Senior level
Employment type
* Full-time
Job function
* Information Technology
Industries
* Software Development