Role Overview
We are collaborating with a leading AI lab to contract experienced professionals for an AI model evaluation project. Contractors will assess the quality, accuracy, and safety of AI-generated responses across specialized domains such as finance, law, medicine, and accounting. The project offers an opportunity to directly improve the reliability of AI systems in high‐stakes contexts where inaccurate information carries serious risk.
Key Responsibilities
Write realistic prompts that reflect how professionals and consumers seek domain‐specific guidance.
Evaluate AI‐generated responses for factual accuracy, regulatory or clinical correctness, and practical usefulness.
Identify fabricated claims, incorrect references, or misleading reasoning across model outputs.
Score and rank multiple model responses using structured rubrics across multiple dimensions.
Provide written justifications with specific evidence for each evaluation.
Ideal Qualifications
Master's degree or higher in a relevant professional field (e.g., Finance, Accounting, Law, Medicine, Healthcare, Engineering).
Professional experience applying domain expertise in a practitioner or advisory capacity.
Familiarity with industry‐specific standards, regulations, or clinical guidelines.
Strong written communication and critical reasoning skills.
More About The Opportunity
Expected commitment: ~20 hours/week.
Application Process
Complete the platform application and required screening assessments.
Complete a training assessment.
Contract and Payment Terms
You will be engaged as an independent contractor.
This is a fully remote role that can be completed on your own schedule.
Projects can be extended, shortened, or concluded early depending on needs and performance.
Payments will be issued according to the payment method and schedule outlined in the contract.
Skills: social media, context, prompt writing, finance, training, medicine, accounting, annotation.