Annotare (https://www.ebi.ac.uk/fg/annotare/) is a user-experience-driven web tool for submitting microarray and sequencing datasets to ArrayExpress (https://www.ebi.ac.uk/arrayexpress/), a repository for functional genomics experiments. Annotare helps to standardise and validate the collected metadata, and facilitates curation.
The submission process guides users through a series of steps that collect the experimental concept, protocols, sample annotation, and the original and processed data. Annotare includes on-the-fly metadata validation, ontology term suggestion, wizard-style functions to quickly populate the forms, and a feedback form for submitters to contact curators. Annotare transforms the collected metadata into the standardised MAGE-TAB spreadsheet format. More than 3000 experiments have been processed through Annotare since 2014, with a median feedback score from users of 8 out of 9.
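As a rough illustration of the validate-then-transform idea described above, the following Python sketch checks a few sample records for required values and writes them out as a tab-delimited sample table. It is not Annotare's implementation: the sample records, the `REQUIRED` list and the function names `validate` and `to_sample_table` are all hypothetical, and only the column headings follow the general MAGE-TAB naming style.

```python
import csv
import io

# Hypothetical sample records, as a submission form might collect them.
samples = [
    {"Source Name": "sample 1",
     "Characteristics[organism]": "Homo sapiens",
     "Array Data File": "sample1.CEL"},
    {"Source Name": "sample 2",
     "Characteristics[organism]": "Homo sapiens",
     "Array Data File": "sample2.CEL"},
]

# Assumption: these two columns must be filled in for every sample.
REQUIRED = ["Source Name", "Characteristics[organism]"]


def validate(rows):
    """Return a list of readable problems; an empty list means the rows pass."""
    problems = []
    for i, row in enumerate(rows, start=1):
        for col in REQUIRED:
            if not row.get(col, "").strip():
                problems.append(f"row {i}: missing value for '{col}'")
    return problems


def to_sample_table(rows, columns):
    """Serialise the rows as tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    issues = validate(samples)
    if issues:
        print("\n".join(issues))
    else:
        print(to_sample_table(samples, list(samples[0].keys())), end="")
```

Running the validation before serialisation mirrors the order described in the text: problems are reported to the submitter while the form is being filled in, and only complete records are written to the spreadsheet.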
All experiments submitted through Annotare are manually curated by trained bioinformaticians with doctoral-level wet-lab research experience. Curation consists of a critical review of each dataset to provide a comprehensive representation of the experiment and to ensure compliance with the MIAME (http://fged.org/projects/miame/) or MINSEQE (http://fged.org/projects/minseqe/) minimum information guidelines. Curated datasets comprise the submitters' raw and processed data files, along with comprehensive descriptions of the samples, experimental design and protocols. This helps to ensure reproducibility, allowing those interested in the original data to understand how it was generated, to recreate it, and to re-analyse and reuse it.
ArrayExpress is an established and secure database, guaranteeing long-term storage of data and metadata. ArrayExpress boosts the visibility of submitters' work: approximately 1000 unique users per day access its collection of over 70,000 experiments. More than 3000 of these experiments have been further analysed and included in Expression Atlas (https://www.ebi.ac.uk/gxa/). The selection criteria include a minimum number of biological replicates and good sample annotation. Expression Atlas is an added-value resource, where gene expression data can be queried at the level of single genes, organisms or conditions of interest.