ArrayExpress is a repository for functional genomics data run by the EBI (Athar et al., 2019). Most submissions today are raw transcriptomic sequencing reads, but the repository was originally created for microarray data, hence the name. Submission is through the Annotare web interface. Behind the scenes, sample metadata is submitted to BioSamples and raw sequence files are submitted to the ENA, with enhanced metadata. Expression matrices can also be submitted to ArrayExpress. Each dataset is curated to ensure it meets the ArrayExpress MAGE-TAB metadata standard for the particular type of experiment and data being submitted. Because raw data is stored in the ENA and made openly available, ArrayExpress is not suitable for data that requires controlled access. It is, however, possible to create ArrayExpress submissions that contain openly available metadata and a summary expression matrix, together with a link to the raw data in the EGA. This makes controlled-access functional genomics datasets more discoverable via ArrayExpress.
Many ArrayExpress experiments have ‘ExpressionSet’ objects that can be downloaded and loaded directly into R. They can also be accessed through the ‘ArrayExpress’ Bioconductor package (Kauffmann et al., 2022). Data and metadata in ArrayExpress can be queried programmatically through the BioStudies API endpoints.
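As an illustration of programmatic access, the following is a minimal Python sketch of a keyword search against the BioStudies API. The base URL, the `arrayexpress` collection path, and the `query` and `pageSize` parameters are assumptions based on the public BioStudies REST interface and should be checked against the current API documentation. The helper names `build_search_url` and `search_studies` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base URL of the BioStudies REST API (verify against the current docs).
BIOSTUDIES_API = "https://www.ebi.ac.uk/biostudies/api/v1"


def build_search_url(query, collection="arrayexpress", page_size=10):
    """Build a search URL restricted to one BioStudies collection.

    The collection path and the 'query'/'pageSize' parameter names are
    assumptions, not confirmed by the text above.
    """
    params = urlencode({"query": query, "pageSize": page_size})
    return f"{BIOSTUDIES_API}/{collection}/search?{params}"


def search_studies(query, **kwargs):
    """Send the search request and return the decoded JSON response."""
    with urlopen(build_search_url(query, **kwargs), timeout=30) as response:
        return json.load(response)
```

Calling `search_studies("single cell RNA-seq", page_size=5)` would send one HTTP GET request and return the parsed JSON. The structure of that response is not described here, so it is best inspected interactively before writing code that depends on particular fields.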
Data is either submitted directly by contributors or curated from other sources such as the HCA and GEO.
References
- Kauffmann, A., Emam, I., & Schubert, M. (2022). ArrayExpress: Access the ArrayExpress Microarray Database at EBI and build Bioconductor data structures: ExpressionSet, AffyBatch, NChannelSet. Bioconductor version: Release (3.15). https://doi.org/10.18129/B9.bioc.ArrayExpress
- Athar, A., Füllgrabe, A., George, N., Iqbal, H., Huerta, L., Ali, A., Snow, C., Fonseca, N. A., Petryszak, R., Papatheodorou, I., Sarkans, U., & Brazma, A. (2019). ArrayExpress update – from bulk to single-cell expression data. Nucleic Acids Research, 47(D1), D711–D715. https://doi.org/10.1093/nar/gky964
Relevant tools and resources
Tool or resource | Description | Related pages | Registry |
---|---|---|---|
ArrayExpress | A repository of array-based genomics data | BioStudies, Expression Atlas, Omics Discovery Index | Tool info, Standards/Databases, Training, Publication |