News Release

Genomic data in GBIF moves a step closer

Aligning standards will help share information on biodiversity yet to be discovered

Peer-Reviewed Publication

Global Biodiversity Information Facility

Copenhagen, Denmark – Important progress has been achieved towards including genomic-level information in the data made freely available through GBIF.

Successful alignment of informatics standards for recording species occurrences and gene-sequence descriptions has opened up new possibilities for integrating the different types of data.

The mapping of three standards was completed at a GBIF-led workshop in Oxford, United Kingdom, which brought together experts from Europe, the United States, China and Japan.

A testing programme will shortly begin to bring data from several repositories of genomic information into the GBIF network, using adaptations of the Darwin Core (DwC) standard for sharing biodiversity data.

The developments will fulfil an objective of GBIF's Strategic Plan 2012-16, which calls for the network to accommodate new types of data so that it can give access to information on the estimated 90 per cent of the world's biodiversity still to be discovered – the currency of which will largely be genomic information.

The workshop, hosted by Oxford University's e-Research Centre, continued the collaboration between GBIF and the Genomic Standards Consortium (GSC), an international community promoting mechanisms to standardize descriptions of genomes and the environmental context in which they occur.

The small group of experts completed the mapping, or alignment, of the Darwin Core, the favoured standard for publishing specimen or observation data through GBIF, with the Minimum Information about any (x) Sequence (MIxS), a standard for genomic data developed by the GSC.

The workshop also successfully mapped MIxS with the proposed standard of the DNA Bank Network, known as ABCDDNA.
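For readers unfamiliar with what mapping one standard to another involves, the idea can be sketched as a simple crosswalk between field names. The sketch below is illustrative only: the term pairs, the function name mixs_to_dwc and the assumed lat_lon format (two space-separated signed decimal degrees) are assumptions made for this example, not the mapping agreed at the workshop.

```python
# Illustrative crosswalk only. The term pairs are assumptions for this sketch,
# not the mapping produced at the workshop.
MIXS_TO_DWC = {
    "collection_date": "eventDate",
}


def mixs_to_dwc(record):
    """Translate a MIxS-style record (dict) into Darwin Core-style terms.

    Returns a pair: the translated terms and the fields left unmapped.
    """
    dwc = {}
    unmapped = {}
    for field, value in record.items():
        if field == "lat_lon":
            # Assumed format: "<latitude> <longitude>" in signed decimal degrees.
            lat, lon = value.split()
            dwc["decimalLatitude"] = float(lat)
            dwc["decimalLongitude"] = float(lon)
        elif field in MIXS_TO_DWC:
            dwc[MIXS_TO_DWC[field]] = value
        else:
            # Keep fields with no counterpart so that nothing is silently lost.
            unmapped[field] = value
    return dwc, unmapped


if __name__ == "__main__":
    sample = {
        "collection_date": "2012-03-01",
        "lat_lon": "-17.5 -149.83",
        "env_biome": "marine biome",
    }
    print(mixs_to_dwc(sample))
```

Fields without a counterpart are returned separately rather than discarded, which loosely reflects why an alignment exercise may call for extensions to a standard as well as a lookup table.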

Two extensions to the Darwin Core Archive (DwC-A) were prototyped, enabling genomic biodiversity data to be published to the GBIF network. The intention is to test this new capability through the following data repositories:

  • The World Federation for Culture Collections (WFCC), which is already a GBIF Associate Participant and whose Beijing-based data centre, the World Data Centre for Microorganisms (WDCM), is developing the WFCC Global Catalogue of Microorganisms;

  • The Moorea Biocode Project, which is creating the first comprehensive inventory of all non-microbial life in a complex tropical ecosystem on a Pacific island, including construction of a library of genetic markers and physical identifiers for every species of plant, animal and fungus;

  • SILVA, a project based in Germany which provides up-to-date, quality-controlled databases of aligned rRNA sequences from the Bacteria, Archaea and Eukarya domains;

  • The MicrobeDB.jp project, including MEO, which is supported by the National Bioscience Database Center (NBDC), Japan;

  • MG-RAST, a US-based automated analysis platform providing quantitative insights into microbial populations based on sequence data; and

  • megx.net, a site providing integrated access to environmental and (meta)genomic data for marine microbial ecology.

Commenting on the progress achieved at the workshop, GBIF's Executive Secretary Donald Hobern said, "One of the great challenges for GBIF, and for our many partners around the world, is to help to unify data from different domains to enable researchers to ask new questions, and to ensure that we have access to all relevant resources to understand biodiversity and interactions between biodiversity, human activity and changing environments.

"This workshop gives us the opportunity to make these connections for a wide range of important datasets and positions us all to handle vast amounts of genomic data in future years."

Further information on specific workshop outcomes is available through a new group on the GBIF Community Site at http://community.gbif.org/pg/groups/22216/genomic-biodiversity-data/.

The workshop was a collaboration with the Research Coordination Network for the GSC (RCN4GSC) project, funded by the US National Science Foundation, which is promoting the integration of genomic standards with ecological and species-level standards through a series of workshops.

Steps were also taken at the workshop to translate Darwin Core terms into Japanese and Chinese, taking advantage of the participation of experts from those two countries.

###

For further information, contact:

Éamonn Ó Tuama
GBIF Secretariat
eotuama@gbif.org

