Scientific Knowledge Engineering – Product Manager
To apply, simply read the job description below and make sure to attach the relevant documents.
The Scientific Knowledge Engineering team is responsible for data modeling, ontology definition and management, vocabulary mapping, and other key metadata activities that ensure platforms and data assets speak the language of science. The team is a core contributor to the delivery of a large-scale R&D Knowledge Graph—the semantic layer that connects data and metadata systems—as well as the core metadata experiences that enable the creation of products and services that delight customers and support advanced automation and intelligence.
This role is responsible for maximizing the value of data assets over their lifecycle by bringing purpose to data. The individual translates highly technical information from domain experts into robust data models, supported by ontologies and controlled vocabularies, that can be used to structure, index, and integrate data effectively. The role works closely with product managers and R&D subject matter experts to define scientific language (data models, ontologies, and standards) for data products, acting as the voice of knowledge base interoperability and long-term asset value.
Key Responsibilities
Define schemas, ontologies, and data models for scientific information required to create value‑adding data products. This includes accountability for quality control and for mapping specifications that are industrialized by data engineering and maintained in platform‑provisioned tooling.
Own quality control processes (validation and verification) for mapping specifications, including models, schemas, and controlled vocabularies.
Partner with product managers and engineers to translate business needs into well‑defined deliverable requirements that enable integration of large‑scale biological data to predict, model, and stabilize therapeutically relevant protein complexes and antigen conformations for drug and vaccine discovery.
Collaborate with external groups to align internal data standards with industry and academic ontologies, ensuring standards are defined with downstream usage and analytics in mind. May also provide data source profiling and advisory support to R&D teams.
Provide bespoke subject matter expertise for R&D data, translating deep scientific concepts into data structures that enable actionable insight.
Contribute to and maintain documentation of data standards, ontology decisions, and mapping rationale to support organizational knowledge transfer and auditability.
Why You?
Basic Qualifications
Master’s degree in Bioinformatics, Biomedical Science, Biomedical Engineering, Molecular Biology, or Computer Science (with a life science application focus)
6+ years of relevant work experience
Demonstrated experience contributing to Knowledge Graph development efforts, including entity modeling, relationship design, and schema governance
Hands‑on experience with open‑source ontology tools and languages such as Protégé, SPARQL, OWL, SKOS, SHACL, RML, and RDF/Turtle
Working knowledge of major life sciences ontologies, including Gene Ontology (GO), OBO Foundry ontologies (e.g., CL, UBERON, HPO, MONDO, CHEBI, EFO, CLO), MeSH, SNOMED CT, and UMLS
Familiarity with linked data principles and semantic web technologies
Experience using industry-standard tools for data serialization and modeling (e.g., JSON Schema, LinkML)
Proficiency in at least one programming language (preferably Python) for scripting vocabulary mappings, building data models, automating quality control, and prototyping pipelines; a minimal illustrative sketch follows this list
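For orientation only, the short sketch below illustrates the kind of vocabulary-mapping and quality-control scripting described in the item above. It is a minimal example, assuming Python with the open-source rdflib package; the internal namespace and concept are hypothetical placeholders, not an actual standard.

    # Minimal illustrative sketch: record a SKOS mapping from a hypothetical
    # internal vocabulary term to a Cell Ontology class, run a basic quality
    # check, and serialize the result as Turtle for downstream review.
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, SKOS

    INTERNAL = Namespace("https://example.org/vocab/")   # hypothetical internal namespace
    OBO = Namespace("http://purl.obolibrary.org/obo/")

    g = Graph()
    g.bind("skos", SKOS)
    g.bind("internal", INTERNAL)

    term = INTERNAL["t_cell"]                             # hypothetical internal concept
    g.add((term, RDF.type, SKOS.Concept))
    g.add((term, SKOS.prefLabel, Literal("T cell", lang="en")))
    g.add((term, SKOS.exactMatch, OBO["CL_0000084"]))     # CL:0000084 is "T cell" in the Cell Ontology

    # Basic quality control: flag any concept that lacks a preferred label.
    missing_labels = g.query("""
        PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
        SELECT ?c WHERE {
            ?c a skos:Concept .
            FILTER NOT EXISTS { ?c skos:prefLabel ?label }
        }
    """)
    print(f"Concepts missing a prefLabel: {len(list(missing_labels))}")

    # Serialize the mapping for review and hand-off to engineering.
    print(g.serialize(format="turtle"))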
Preferred Qualifications
Experience with data governance and data quality tooling (e.g., Ataccama, Informatica, Talend, OpenRefine, Great Expectations, dbt)
Experience supporting AI‑readiness or LLM integration workflows, including metadata enrichment, entity linking, embedding pipelines, or retrieval‑augmented generation (RAG) architectures
Understanding of vector databases and their role in semantic search and knowledge retrieval (e.g., Weaviate, Chroma); a brief illustrative sketch follows this list
Familiarity with cloud data platforms and infrastructure relevant to large-scale biological data (e.g., AWS, GCP, Azure)
Familiarity with graph database technologies (e.g., Neo4j, Amazon Neptune, Stardog, GraphDB, TigerGraph)
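Purely as an illustration of the semantic-search pattern referenced in the vector-database item above, the sketch below loads a few short documents into an in-memory vector store and runs a similarity query. It assumes the open-source chromadb Python client with its default embedding function; the collection name and documents are hypothetical examples.

    # Minimal illustrative sketch: semantic search over a handful of notes
    # using an in-memory Chroma collection and its default embedding function.
    import chromadb

    client = chromadb.Client()                             # ephemeral, in-memory instance
    notes = client.create_collection(name="assay_notes")   # hypothetical collection name

    notes.add(
        ids=["n1", "n2"],
        documents=[
            "Flow cytometry panel for T cell activation markers.",
            "Stability study of the antigen under varying pH conditions.",
        ],
        metadatas=[{"source": "lab-notebook"}, {"source": "lab-notebook"}],
    )

    # Retrieve the note most semantically similar to a free-text question.
    hits = notes.query(query_texts=["Which document discusses immune cell markers?"], n_results=1)
    print(hits["documents"][0])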