
Repositories: BioSample

All major archives have some version of a ‘sample’ object that stores sample-related metadata. The National Center for Biotechnology Information (NCBI; SRA), DNA Data Bank of Japan (DDBJ) and China National Center for Bioinformation - National Genomics Data Center (CNCB-NGDC) databases are limited to storing metadata that is submitted together with a study to their related genome archives.

As well as storing metadata from samples that are submitted as part of an ENA submission, the EBI’s BioSamples database can be used for any form of sample metadata archiving, and its records can be linked to other EBI archives at a later point. It is flexible enough to store any kind of key-value pair, and values can be linked to ontology terms. Multiple BioSamples can be linked together: for example, a virus sample could be linked to a sample from its host through a ‘Derived from’ relationship. Samples may also be grouped under a project by specifying a ‘project’ key.
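
The linking described above can be pictured with a small in-memory model. This is only an illustrative sketch, not the BioSamples schema: the class names, the accessions `HOST-1` and `VIRUS-1`, the project name `example-project` and the helper `samples_in_project` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Attribute:
    """A value for one key, optionally annotated with an ontology term IRI."""
    value: str
    ontology_term: Optional[str] = None


@dataclass
class Relationship:
    """A typed link from one sample (source) to another (target)."""
    source: str
    type: str
    target: str


@dataclass
class Sample:
    """Hypothetical sample record: free-form key-value attributes plus links."""
    accession: str
    attributes: Dict[str, Attribute] = field(default_factory=dict)
    relationships: List[Relationship] = field(default_factory=list)


def samples_in_project(samples, project):
    """Return accessions of samples whose 'project' attribute equals `project`."""
    matches = []
    for sample in samples:
        attribute = sample.attributes.get("project")
        if attribute is not None and attribute.value == project:
            matches.append(sample.accession)
    return matches


# Hypothetical host sample; the organism value carries an ontology term.
host = Sample(
    accession="HOST-1",
    attributes={
        "organism": Attribute(
            "Homo sapiens",
            "http://purl.obolibrary.org/obo/NCBITaxon_9606",
        ),
        "project": Attribute("example-project"),
    },
)

# Hypothetical virus sample, linked to its host and grouped in the same project.
virus = Sample(
    accession="VIRUS-1",
    attributes={"project": Attribute("example-project")},
)
virus.relationships.append(
    Relationship(source=virus.accession, type="derived from", target=host.accession)
)
```

In this sketch the ‘project’ key is just another attribute, so grouping samples by project is an ordinary attribute lookup rather than a separate mechanism.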

Samples can be linked to European Genome-phenome Archive (EGA) projects if the sample metadata provided is openly accessible (e.g. https://www.ebi.ac.uk/biosamples/samples/SAMEA4940335).

Metadata can be submitted and queried via the BioSamples REST API. There is a Python wrapper, though it is not actively maintained.
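
As a minimal sketch of querying with the Python standard library only (not the wrapper mentioned above), the snippet below builds the two kinds of URL that appear on this page: a single sample by accession and a search filtered by attribute. The function names are hypothetical, and it is an assumption that the service returns JSON when sent an `Accept: application/json` header; check the current BioSamples API documentation before relying on it.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the sample links on this page.
BASE_URL = "https://www.ebi.ac.uk/biosamples/samples"


def sample_url(accession):
    """Build the URL of a single sample record from its accession."""
    return f"{BASE_URL}/{urllib.parse.quote(accession)}"


def search_url(attribute, value):
    """Build a search URL that filters on one attribute, e.g. project=FAANG."""
    query = urllib.parse.urlencode({"filter": f"attr:{attribute}:{value}"})
    return f"{BASE_URL}?{query}"


def fetch_json(url, timeout=30):
    """Fetch a URL and parse the body as JSON (assumes a JSON response)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `search_url("project", "FAANG")` reproduces the FAANG filter URL used on this page, and `fetch_json(sample_url("SAMEA4940335"))` would retrieve the example sample, given network access.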

BioSamples can also be used to provide a stable identifier for project samples while they are being processed; the identifier can then be linked to the final archival submission. Examples include the HipSci project (Streeter et al., 2017) and the FAANG project (https://www.ebi.ac.uk/biosamples/samples?filter=attr%3Aproject%3AFAANG, https://data.faang.org/specimen, https://www.faang.org/).

References

  1. Streeter, I., Harrison, P. W., Faulconbridge, A., Flicek, P., Parkinson, H., & Clarke, L. (2017). The human-induced pluripotent stem cell initiative—data resources for cellular genetics. Nucleic Acids Research, 45(Database issue), D691–D697. https://doi.org/10.1093/nar/gkw928

Relevant tools and resources

| Tool or resource | Description | Related pages | Registry |
|---|---|---|---|
| BioSamples | BioSamples stores and supplies descriptions and metadata about biological samples used in research and development by academia and industry. | | Tool info, Standards/Databases, Training, Publication |
| China National Center for Bioinformation - National Genomics Data Center (CNCB-NGDC) | China's national initiative to build platforms and infrastructure to support life science researchers in academia and industry. | BioProject | Publication |
| DNA Data Bank of Japan (DDBJ) | Japan-based nucleotide sequence archive database and accompanying database tools for sequence submission, entry retrieval and annotation analysis. | Repositories | Tool info, Publication |
| National Center for Biotechnology Information (NCBI) | Online database hosting a vast amount of biotechnological information including nucleic acids, proteins, genomes and publications. | dbGAP | |
| The European Genome-phenome Archive (EGA) | EGA is a service for permanent archiving and sharing of all types of personally identifiable genetic and phenotypic data resulting from biomedical research projects. | EGA Omics Discovery Index Repositories | Tool info, Standards/Databases, Training, Publication |
Contributors