Submission 31

Cancer genomics studies continue to generate massive volumes of data that vary greatly in primary and metadata file formats, naming conventions for attributes such as gene and disease names, positional coordinates, choice of reference genome, and other factors. This variability is amplified by the non-uniform representation of such data across numerous distributed repositories, each with a distinct focus. OncoMX, a web portal providing community access to an integrated cancer mutation and expression database, is being actively developed to address these and other known issues of biomedical data integration and thereby better facilitate cancer biomarker research and discovery. OncoMX is an international collaboration between the George Washington University, NASA’s Jet Propulsion Laboratory, the Swiss Institute of Bioinformatics, and the University of Delaware. Through this collaboration, sequencing-based mutation and expression data from large-scale cancer studies and databases, including TCGA, ICGC, ClinVar, COSMIC, and more, are integrated and unified by Disease Ontology terms and Uberon anatomical structures into BioMuta and BioXpress, the core knowledgebases for the OncoMX portal. Supplemental data on normal expression across organisms are integrated from Bgee, and text-mining results for both mutation and expression in cancer are generated by custom tools. Additional annotations are retrieved from the Early Detection Research Network (EDRN) and a number of other resources to aid end-user interpretation of candidate variant and expression biomarkers. Access to the integrated data and supporting information in OncoMX is expected to promote efficient synthesis and consumption of information by end-users and ultimately aid cancer biomarker research. OncoMX is supported through NCI ITCR U01 funds.