The main aim of Beacon v2 is to make datasets discoverable by criteria of interest, including clinical, demographic, experimental and variant-level data (Rambla et al., 2022). It provides a common data model and a framework specification for how the metadata can be queried via an API (Rambla et al., 2022). Many Beacons can be aggregated into a single network, so that a single query can be sent to multiple individual Beacon instances (Rambla et al., 2022). It was approved as a GA4GH standard in April 2022 (GA4GH, 2022).
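As a rough illustration of how one query can be sent to several Beacon instances, the sketch below builds a single request body and passes it to each member of a network. The field names (`meta`, `query`, `filters`, `requestedGranularity`, `responseSummary.exists`) reflect one reading of the Beacon v2 framework and should be checked against the specification. The `Transport` abstraction, the Beacon names and the stubbed responses are hypothetical and stand in for real HTTP calls.

```python
from typing import Callable, Dict, List, Optional

# A transport is any function that sends a request body to one Beacon and
# returns the parsed JSON response (hypothetical abstraction, not a Beacon API).
Transport = Callable[[dict], dict]


def build_query(filter_ids: List[str], granularity: str = "boolean") -> dict:
    """Build a request body; field names are assumed from the Beacon v2 framework."""
    return {
        "meta": {"apiVersion": "2.0"},
        "query": {
            "filters": [{"id": fid} for fid in filter_ids],
            "requestedGranularity": granularity,
        },
    }


def query_network(beacons: Dict[str, Transport], body: dict) -> Dict[str, Optional[bool]]:
    """Send one body to every Beacon; None marks a Beacon that failed to answer."""
    results: Dict[str, Optional[bool]] = {}
    for name, send in beacons.items():
        try:
            response = send(body)
        except Exception:
            # One unreachable Beacon should not fail the whole network query.
            results[name] = None
            continue
        summary = response.get("responseSummary", {})
        results[name] = bool(summary.get("exists", False))
    return results


def _offline(_body: dict) -> dict:
    """Stub transport that simulates an unreachable Beacon."""
    raise ConnectionError("beacon unreachable")


# Stubbed network of three hypothetical Beacons.
demo_beacons: Dict[str, Transport] = {
    "beacon-a": lambda body: {"responseSummary": {"exists": True}},
    "beacon-b": lambda body: {"responseSummary": {"exists": False}},
    "beacon-c": _offline,
}
demo_results = query_network(demo_beacons, build_query(["NCIT:C16576"]))
print(demo_results)
```

Keeping the transport as a plain function means the same aggregation logic can be exercised offline with stubs and later pointed at real endpoints.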
The Beacon v2 model is based on seven object types with default schemas. Schemas have minimal required fields and can be expanded with additional fields to suit the needs of a particular implementation. The model structure is highly nested and ontologised. There are recommended ontologies for particular fields, but nothing is enforced, as communities are expected to choose the ontologies that best suit their use case. This offers flexibility but may present challenges when aggregating multiple Beacons that have been curated with different ontologies.
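The aggregation challenge can be made concrete with a small sketch. It assumes a simplified individual record whose only required fields are `id` and `sex`, with `sex` given as an ontology term; this is a reading of the default schema, not a copy of it. The `extra` field, the record identifiers and the `EQUIVALENTS` mapping are hypothetical. Two Beacons code the same concept with different ontologies, so a filter on one term misses the other unless a mapping is supplied.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class OntologyTerm:
    id: str          # CURIE such as "NCIT:C16576"
    label: str = ""  # human-readable label


@dataclass
class Individual:
    # Assumed minimal required fields for this sketch.
    id: str
    sex: OntologyTerm
    # Implementation-specific additions (hypothetical extension point).
    extra: Dict[str, Any] = field(default_factory=dict)


def matches_sex(
    individual: Individual,
    term_id: str,
    equivalents: Optional[Dict[str, List[str]]] = None,
) -> bool:
    """True if the record's sex term equals term_id or a mapped equivalent."""
    accepted = {term_id} | set((equivalents or {}).get(term_id, []))
    return individual.sex.id in accepted


# Two hypothetical Beacons that code the same concept with different ontologies.
beacon_a = [Individual("A-1", OntologyTerm("NCIT:C16576", "Female"))]
beacon_b = [
    Individual(
        "B-1",
        OntologyTerm("PATO:0000383", "female"),
        extra={"recruitmentSite": "hypothetical site"},
    )
]

# Hypothetical cross-ontology mapping maintained by the aggregator.
EQUIVALENTS = {"NCIT:C16576": ["PATO:0000383"]}

naive_hits = [i.id for i in beacon_a + beacon_b if matches_sex(i, "NCIT:C16576")]
mapped_hits = [
    i.id for i in beacon_a + beacon_b if matches_sex(i, "NCIT:C16576", EQUIVALENTS)
]
print(naive_hits, mapped_hits)
```

Without the mapping, only the record from the first Beacon matches; with it, both do. A network that aggregates differently curated Beacons needs some equivalent of this mapping step.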
Implementations
Since Beacon is a specification framework, anyone is free to develop their own implementation that meets the specification. Implementations can be registered on the EGA registry server, which assesses how well an implementation matches the specification. A few implementations are described in more detail below, as well as in the Beacon documentation.
Reference Implementation
The reference implementation was developed by the CRG and is built on an internal MongoDB database with APIs defined in Python (Rueda et al., 2022). It is intended as a starting point for those who want to try out Beacon and gives them a head start on the resourcing required to light a Beacon.
Tutorials are available that explain how to ‘beaconize’ your data, ingest it into the database and set up the API (Rueda et al., 2022).
The two main GitHub repos are:
- Beacon v2 Reference Implementation (Data ingestion tools)
- Beacon v2 Reference Implementation (API)
Serverless Beacon (sBeacon)
A serverless implementation of the Beacon specification has been developed by the Transformational Bioinformatics team at CSIRO AERC. This ‘on-demand’ implementation can offer a fast and cost-effective way of running a Beacon, as the cost is proportional to how much it is used rather than being a constant running cost. The implementation was originally published against version 1 of the specification, and its developers are now updating it to support Beacon v2.
An investigation of the in-progress v2 sBeacon implementation by the team at the University of Melbourne Centre for Cancer Research can be explored here:
Java Beacon (jBeacon)
A Java implementation of the Beacon v2 Specification has been developed by the Barcelona Supercomputing Centre (BSC) (Repchevsky et al., 2022).
An investigation of the jBeacon implementation by the team at the University of Melbourne Centre for Cancer Research can be explored here:
Beacon networks
ELIXIR Beacon network
The ELIXIR Beacon network brings together 11 Beacons from across Europe. Most of these Beacons are at version 1. It is possible to register a Beacon into this system following the instructions.
GA4GH Beacon network
The GA4GH Beacon network aggregates data from over 40 institutes. Because it aggregates version 1 Beacon data, it contains variants mapped to GRCh37 and no contextual metadata. It is maintained and hosted by DNAstack. It isn’t clear whether the network is still actively maintained or whether it will be adapted to the version 2 specification.
References
- Rueda, M., Ariosa, R., Moldes, M., & Rambla, J. (2022). Beacon V2 Reference Implementation: a Toolkit to enable federated sharing of genomic and phenotypic data. Bioinformatics, btac568. https://doi.org/10.1093/bioinformatics/btac568
- Repchevsky, D., Capella-Gutierrez, S., & Gelpí, J. L. (2022). Open source Java implementation of the Beacon v2 API. F1000Research, 11. https://doi.org/10.7490/f1000research.1118980.1
- GA4GH. (2022). New release of GA4GH Beacon expands genomic and clinical data access [Article]. https://www.ga4gh.org/news/new-release-of-ga4gh-beacon-expands-genomic-and-clinical-data-access/
- Rambla, J., Baudis, M., Ariosa, R., Beck, T., Fromont, L. A., Navarro, A., Paloots, R., Rueda, M., Saunders, G., Singh, B., Spalding, J. D., Törnroos, J., Vasallo, C., Veal, C. D., & Brookes, A. J. (2022). Beacon v2 and Beacon networks: A “lingua franca” for federated data discovery in biomedical genomics, and beyond. Human Mutation, 43(6), 791–799. https://doi.org/10.1002/humu.24369
Relevant tools and resources
Tool or resource | Description | Related pages | Registry |
---|---|---|---|
Beacon v2 | API framework and data model for cross-cohort searching. | How to add to and edit... GA4GH | Standards/Databases Publication |