Job ID: | Amazon Development Center U.S., Inc.

Amazon Quick Suite is an enterprise AI platform that transforms how organizations work with their data and knowledge. Combining generative AI-powered search, deep research capabilities, intelligent agents and automations, and comprehensive business intelligence, Quick Suite serves tens of thousands of users. Our platform processes thousands of queries monthly, helping teams make faster, data-driven decisions while maintaining enterprise-grade security and governance. From natural language interactions with complex datasets to automated workflows and custom AI agents, Quick Suite is redefining workplace productivity at unprecedented scale.

Key Job Responsibilities

Leverage data-centric AI principles to assess the impact of data on model performance and the broader machine learning pipeline.
Apply Generative AI techniques to evaluate how well our data represents human language and conduct experiments to measure downstream interactions.
Design and develop comprehensive evaluation and benchmarking datasets for Quick Suite AI-powered features.
Leverage LLMs to generate synthetic data corpora and to evaluate data and assess its quality in LLM-as-a-judge settings.
Create ground truth datasets with high-quality question-answer pairs across diverse domains and use cases.
Lead human annotation initiatives and model evaluation audits to ensure data quality and relevance.
Develop and refine annotation guidelines and quality frameworks for evaluation tasks.
Conduct statistical analysis to measure model performance, identify failure patterns, and guide improvement strategies.
Collaborate with ML scientists and engineers to translate evaluation insights into actionable product improvements.
Build scalable data pipelines and tools to support continuous evaluation and benchmarking efforts.
Contribute to Responsible AI initiatives by developing safety and fairness evaluation datasets.

Basi