Our Wikidata - Human Disease Ontology (DO) team addresses the ongoing challenge of biomedical knowledge dissemination and integration by implementing a crowdsourcing curation model. Previously, we seeded the Wikidata semantic network linking genes, drugs, and diseases with nodes and edges populated automatically by 'bots' that integrate data from trusted authorities such as NCBI’s Entrez Gene, PubChem, and the DO. This effort enhances Wikidata and Wikipedia and enables powerful integrative queries that span multiple domain areas.
Here, we describe a new system to monitor, filter, and prioritize changes made by Wikidata contributors to disease items. The identified changes are coordinated through the DO’s GitHub tickets (https://github.com/DiseaseOntology/HumanDiseaseOntology/issues), reviewed by a DO curator, and integrated into the DO where appropriate. The three main types of changes are modification of the ontology hierarchy, addition or correction of external identifiers, and addition of other Wikidata properties. Through this process, we have established a new crowdsourcing model for targeted curation and integration of new knowledge into Wikidata and the DO. The crowd has added to Wikidata a total of 2681 GARD, MeSH, NCI, OMIM, Orphanet, or UMLS cross-references, and 1774 subclasses have been added to 8511 disease items. Additionally, this effort has identified 152 diseases added by Wikidata users that are potential new terms for the DO.
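As an illustration only, the sorting of contributor changes into the three types named above could be sketched as follows. This is a minimal sketch and not the system described here: the function names, the record layout, and the example records are hypothetical, and the Wikidata property IDs are assumptions that should be verified against Wikidata before use.

```python
from collections import Counter

# Assumed Wikidata property IDs (verify against Wikidata before relying on them).
HIERARCHY_PROPS = {"P279"}  # "subclass of"
XREF_PROPS = {
    "P4317": "GARD",
    "P486": "MeSH",
    "P1748": "NCI",
    "P492": "OMIM",
    "P1550": "Orphanet",
    "P2892": "UMLS",
}


def classify_change(property_id):
    """Map an edited property to one of the three change types (hypothetical helper)."""
    if property_id in HIERARCHY_PROPS:
        return "hierarchy"
    if property_id in XREF_PROPS:
        return "external_identifier"
    return "other_property"


def summarize(changes):
    """Count changes per type; each change is a dict with a 'property' key."""
    return Counter(classify_change(change["property"]) for change in changes)


# Hypothetical change records, for illustration only (not real data).
example_changes = [
    {"item": "disease-item-1", "property": "P279"},
    {"item": "disease-item-1", "property": "P492"},
    {"item": "disease-item-2", "property": "P2892"},
    {"item": "disease-item-3", "property": "P18"},
]

summary = summarize(example_changes)
```

A per-type summary of this kind could help a curator decide which changes to raise as GitHub tickets first; the actual filtering and prioritization criteria are not specified in this abstract.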
The Wikidata crowdsourcing curation model is generalizable to other CC0 open-licensed biomedical ontologies. This approach offers a novel solution for integrating new knowledge into a biomedical ontology through distributed crowdsourcing while also enhancing resource exposure and engaging a broader community.