The Final Link between Legacy Literature and Bioclimatic Modelling
Patricia Kelbert (1) and Quentin Groom (2)
1. Botanischer Garten und Botanisches Museum Berlin-Dahlem, Freie Universität Berlin, Germany
2. Botanic Garden Meise, Belgium
The EDIT Platform for Cybertaxonomy is a convenient tool for managing and editing details of specimens and observations. The BioVeL workflows for data refinement and niche modelling, in turn, provide a powerful means to clean up and analyse the distributions of organisms. What was lacking was a seamless way to join the two, so that at one end of the workflow a researcher can manage data in a user-friendly interface and at the other end sophisticated distribution models can be generated. A task group at the recent pro-iBiosphere Hackathon tackled this problem.
One of the pro-iBiosphere pilots used legacy literature as a source of data on historical changes in the distribution of Chenopodium vulvaria. Details of over 2000 observations and specimens were imported into the Common Data Model (CDM) database administered with the Taxonomic EDITor. Many of these data were extracted from legacy literature through digitization and mark-up. Imported as a whole into the CDM, they form a valuable test dataset for bioclimatic niche modelling. In this way, heterogeneous data were homogenised to make them amenable to statistical analysis.
Until now, linking the database to the workflow was possible only for experienced users with direct access to the database. During the hackathon the task group developed a new Java web service within the CDM-library. This web service takes the identifier of a taxon as input and returns a list of specimen or observation details. The fields returned were chosen to meet the requirements for reuse in the BioVeL refinement workflow, and also include other fields that might be useful in the future. In this manner we have completed the final link in a workflow that starts with 16th-century botanists and ends with 21st-century bioclimatic modelling.
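The contract of such a service (a taxon identifier in, a list of occurrence records out) can be illustrated with a minimal sketch. This is not the CDM-library implementation, which is written in Java; it is an in-memory Python stand-in. The class, function and field names, the identifiers and the example records are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Hypothetical record type: these field names are assumptions,
# not the fields actually returned by the CDM web service.
@dataclass
class Occurrence:
    taxon_id: str
    scientific_name: str
    latitude: Optional[float]
    longitude: Optional[float]
    year: Optional[int] = None    # example of an "extra" field kept for future use
    source: Optional[str] = None  # e.g. "literature" or "specimen"


def occurrences_for_taxon(taxon_id, store):
    """Return every occurrence in `store` linked to `taxon_id`, as plain dicts."""
    return [asdict(o) for o in store if o.taxon_id == taxon_id]


# Tiny in-memory stand-in for the database; all values are invented placeholders.
STORE = [
    Occurrence("taxon-1", "Chenopodium vulvaria", 50.0, 4.0, 1850, "literature"),
    Occurrence("taxon-1", "Chenopodium vulvaria", 52.0, 13.0, None, "specimen"),
    Occurrence("taxon-2", "Placeholder taxon", 48.0, 2.0),
]

records = occurrences_for_taxon("taxon-1", STORE)
```

Returning plain dictionaries keeps the result easy to serialise (for example to JSON or CSV) for a downstream workflow, and fields that are not needed yet can simply be ignored by the consumer.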
Figure 1. A schema showing the flow of data from legacy publications to modelling workflows. The red arrow shows the additional link in the chain.

