Government-backed Abu Dhabi organization focused on advanced technology R&D (est. 2020), defining strategy, funding, and policies across AI, robotics, and emerging technologies. It oversees the full innovation lifecycle — from research and programs to commercialization — through dedicated applied research, innovation, and venture entities.

The first production system is an AI-enabled operational platform that gives a senior leadership team a shared situational picture, an AI-classified signal feed, a daily AI-generated briefing, and an action accountability tracker. MVP target: operational within two weeks of team formation. The platform is also the technical foundation for all subsequent Data & AI systems across the organization.

Build, own, and continuously improve the AI capabilities in the DAIO's (Data & AI Office) production systems: real-time signal classification against a defined scenario framework, and daily AI-generated briefings. This is not a research role and not a fine-tuning role. It is applied AI engineering — structured prompts, observable outputs, deterministic fallbacks, and measurable quality. The AI capabilities must work reliably under production conditions, including API outages, malformed signal data, and edge-case classification scenarios.
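The classification capability described above returns structured JSON (scenario tag, confidence, rationale) and must degrade deterministically when the model's output is malformed. A minimal standard-library sketch of that contract is below; the tag set, field names, and the 0.6 review threshold are illustrative assumptions, and a production version would enforce the schema with Pydantic rather than hand-rolled checks:

```python
import json

# Hypothetical scenario taxonomy and output contract -- illustrative only;
# the real taxonomy is owned by the business domain owners.
SCENARIO_TAGS = {"supply_disruption", "cyber_incident", "policy_shift", "other"}
REQUIRED_FIELDS = {"scenario_tag", "confidence", "rationale"}
REVIEW_THRESHOLD = 0.6  # assumed low-confidence cutoff, not from the posting

def parse_classification(raw: str) -> dict:
    """Validate an LLM classification response; fall back to 'unclassified' on any failure."""
    try:
        data = json.loads(raw)
        if not REQUIRED_FIELDS.issubset(data):
            raise ValueError("missing required fields")
        if data["scenario_tag"] not in SCENARIO_TAGS:
            raise ValueError("unknown scenario tag")
        confidence = float(data["confidence"])
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence out of range")
        # Low-confidence outputs are flagged for human review rather than rejected.
        data["needs_review"] = confidence < REVIEW_THRESHOLD
        return data
    except (ValueError, TypeError, json.JSONDecodeError):
        # Deterministic fallback: store as unclassified and surface for manual review.
        return {"scenario_tag": "unclassified", "confidence": 0.0,
                "rationale": "parse/validation failure", "needs_review": True}
```

The key design point is that every failure mode (bad JSON, unknown tag, out-of-range confidence) collapses to the same reviewable fallback record, so downstream consumers never see a partially valid classification.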
This role also designs the migration path from the initial LLM runtime to the sovereign model runtime in Phase 2.

WHAT THIS ROLE BUILDS & OWNS
- AI Classification & Briefing Service — FastAPI wrapper around the LLM API with two versioned prompt templates
- Signal classification prompt — structured prompt against a defined scenario taxonomy, returning JSON with scenario tag, confidence level, and rationale
- Daily briefing generation prompt — structured 400–600 word output covering signal summary, scenario assessment, delta from prior day, and recommended decision agenda
- Prompt versioning system — templates stored in configuration, editable by authorized users without code changes
- Observability layer — every API call logged with input hash, model version, output, latency, and token count
- Fallback logic — graceful degradation when the LLM API is unavailable: items stored as unclassified and surfaced for manual review
- Classification quality evaluation framework — weekly precision measurement against a human reviewer sample
- Phase 2: sovereign model runtime migration plan — prompt adaptation, integration testing, performance benchmarking

KEY DECISIONS THIS ROLE OWNS
- Prompt design for each capability — structure, temperature, output format, system vs. user message split
- Confidence threshold definition — what triggers a low-confidence flag requiring human review
- Context window management for briefing generation — which signal subset to include within the token budget
- When to trigger prompt iteration vs. accept current classification quality
- Which classification errors are acceptable vs. unacceptable given operational stakes
- Sovereign model prompt adaptation scope for Phase 2 — what needs rewriting, what transfers

WHAT THIS ROLE DOES NOT DO
- Build the backend API or ingestion pipeline — this role calls the API, it does not build it
- Fine-tune or train models — this is prompt engineering and integration, not ML research
- Define the operational scenario taxonomy — that is business domain knowledge owned by designated owners
- Own the data schema for signals — that is owned by the Head of Data Architecture

PROFILE OF THE IDEAL CANDIDATE
Has shipped an LLM-based feature that non-AI users depend on daily — and has been responsible when it breaks. Knows that the hardest part of applied AI is not the prompt — it is the fallback, the observability, and the human review loop. Can write a classification prompt in the morning, evaluate its precision against a ground-truth set in the afternoon, and ship an improved version the next day. Not attached to a particular model — the job is reliable output, not elegant architecture.

SKILLS & EXPERIENCE
- Anthropic Claude API — structured output prompting, JSON mode, system prompt design
- Prompt engineering for classification tasks — zero-shot and few-shot with examples
- Python — async API calls, error handling, retry logic with exponential backoff
- LLM evaluation — precision/recall for classification, human-AI agreement measurement
- Structured output design — JSON schema enforcement, output validation with Pydantic
- Sovereign model APIs (Falcon, Llama, or equivalent)
- Token budgeting and context window management
- Observability for AI systems — output quality monitoring, anomaly detection
- FastAPI — building the AI service wrapper
- Docker deployment of AI service components

ENGAGEMENT MODEL: DIRECT INDEPENDENT CONTRACTOR (PLEASE READ CAREFULLY)
This is an independent contractor opportunity based on a direct contractual relationship between Zoolatech and the individual service provider. To facilitate this direct partnership, we engage with professionals who are registered and operate as a sole
proprietorship, private entrepreneur, or an equivalent self-employment status in your country. Please note that our model does not accommodate contracts through third-party intermediaries such as agencies, incubators, or umbrella companies. The essential requirement is your ability to enter into a service agreement and invoice Zoolatech directly. This is not an offer of direct employment.

Please note that only candidates whose profiles closely match our requirements will be contacted.