Experteer Overview
In this role, you drive production-ready AI across a large digital-services ecosystem. You set technical direction for multi-agent AI systems, oversee model development and deployment, and ensure observability and reliability at scale. You collaborate with cross-functional teams to translate business problems into ML solutions, shaping the AI platform used by millions of SMBs. This is a hands-on leadership role that blends research, engineering, and production responsibilities to deliver real impact.
Responsibilities
• Architect and evolve a multi-agent orchestration platform (Hermes/Multica) with plugin systems and observability hooks
• Design voice AI pipelines with low end-to-end latency targets and telephony integration
• Build and maintain RAG pipelines with quality measurement over vector and keyword indexes
• Define MCP server architecture and tool-use contracts for internal and external integrations
• Fine-tune and evaluate LLMs (LoRA, QLoRA, DPO) for domain-specific tasks; manage model lifecycle
• Own AI observability stack (Langfuse tracing, LLM instrumentation, cost tracking, quality alerts) and enforce guardrails (PII redaction, safety scanning)
• Develop data ingestion, preprocessing and feature pipelines; drive ML CI/CD with automated eval gating and canary releases
• Set architectural standards, conduct design reviews, mentor engineers, and collaborate with Product to translate business problems into ML problems
• Engage with external research partners to identify production-ready signals and open-source opportunities
Key Requirements
• 8+ years in ML Engineering, Applied AI, or Research Engineering with leadership experience
• Deep production experience with LLMs: fine-tuning, RLHF/DPO, prompt engineering, RAG, tool use
• Proficiency in Python and core ML stack: PyTorch, Transformers (HuggingFace), PEFT/LoRA
• Hands-on experience with LLM inference serving in latency-sensitive environments (vLLM, TensorRT-LLM, TGI)
• Practical knowledge of agentic frameworks: multi-agent coordination, tool-use orchestration, observability
• Experience with speech AI or real-time audio systems is a strong plus
• Solid MLOps knowledge: experiment tracking (MLflow/W&B), model registries, Docker/Kubernetes, ML CI/CD
• Awareness of LLM risks (hallucination, data leakage, privacy) and mitigation strategies
• Strong communication skills for design docs, architecture reviews, and explaining technical decisions to stakeholders
Compensation / Incentives
•