Wednesday, February 27, 2008

The Launch of


Knowledge of the actual number of species on planet Earth is one of the last frontiers in science. It is not known exactly how many species have been identified and described, much less the number of as yet undescribed species.

However, the species we do know are documented in well over a hundred million pages of printed scientific books and journals. This knowledge is hidden in libraries, and no single library holds all of it.

Species descriptions are very rich in data: each is essentially a quality-controlled summary of what is known about a particular species at a specific time. In the best cases, this information includes a detailed morphological description, drawings and images, a summary of behavior and ecology, and a detailed list of all the specimens studied. More recent publications may also provide links to DNA sequences, video documentation and other forms of data. E-publications have recently become available, but many of them are copyrighted and thus not openly available to the public for perusal or use. Nor are they easily machine-searchable for discovery and re-use of their contents.

Recently, the Biodiversity Heritage Library (BHL) was launched as a large-scale operation to digitize this biodiversity literature. Currently, it includes major US and UK natural history libraries, with the ultimate goal of including the entire global literature. All publications will be openly accessible to the public unless they are copyrighted, so most recent publications remain out of reach. The BHL thus falls short of realizing the full potential of these publications.

Tagging the “boundaries” of a species description and identifying the species it deals with supports discovery and retrieval of data in ways not possible through Google. Mark-up of species descriptions permits queries such as "Which red ants occur in London?", a very common form of query.
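To make the idea concrete, here is a minimal sketch in Python of how such a query could run once descriptions are marked up. The element names (`treatment`, `taxonName`, `description`, `locality`) and the three sample records are invented for illustration; they are not taken from the TaxonX schema or from any real publication, and the matching is plain substring search rather than anything the actual service is known to do.

```python
import xml.etree.ElementTree as ET

# Hypothetical mark-up: element names and records are illustrative only,
# NOT the TaxonX schema and NOT real published descriptions.
SAMPLE = """
<publication>
  <treatment>
    <taxonName>Myrmica rubra</taxonName>
    <description>Worker reddish brown, body colour red to yellowish red.</description>
    <locality>London</locality>
  </treatment>
  <treatment>
    <taxonName>Lasius niger</taxonName>
    <description>Worker dark brown to black.</description>
    <locality>London</locality>
  </treatment>
  <treatment>
    <taxonName>Formica rufa</taxonName>
    <description>Head and thorax red, gaster dark.</description>
    <locality>Zurich</locality>
  </treatment>
</publication>
"""


def find_taxa(xml_text, colour, place):
    """Return taxon names whose description mentions `colour`
    and whose locality equals `place` (case-insensitive)."""
    root = ET.fromstring(xml_text)
    hits = []
    # Each <treatment> element marks the boundaries of one description.
    for treatment in root.iter("treatment"):
        description = (treatment.findtext("description") or "").lower()
        locality = (treatment.findtext("locality") or "").lower()
        if colour.lower() in description and place.lower() == locality:
            hits.append(treatment.findtext("taxonName"))
    return hits


print(find_taxa(SAMPLE, "red", "London"))
```

Because the boundaries of each description are tagged, the colour term and the locality are matched within the same description rather than anywhere on the same page, which is what a plain full-text search cannot guarantee.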

Under some national copyright legislation, such as the Swiss, descriptions cannot be copyrighted: historical constraints (there are tens of millions of descriptions) and peer review have standardized them, and they list factual, in most cases morphological, data describing species. They can therefore all be made readily accessible.

The service launched here is new and Web-based. It offers access to descriptions of species and an archive to store the publications as marked-up documents. GoldenGate, a dedicated editor, has been developed to mark up the publications and support the extraction of descriptions, based on TaxonX, an XML schema modeling the logical content of these publications. The Plazi Search and Retrieval Server, building on this systematic mark-up of texts, provides powerful search functions to find species descriptions, or even simple mentions of species, permitting users to answer questions like “Which species occur together?”

The service already includes more than 3,700 descriptions of 3,000 taxa. The goal is to archive all forthcoming new descriptions and, contingent upon additional funding, all the descriptions of the 12,278 known ant species listed in the Hymenoptera Name Server, enhanced with globally unique species numbers (LSIDs: Life Science Identifiers). While ants provide the original test case, the service is not restricted to ants: it is potentially open to all groups, from Bacteria to Plants, and will support most major languages. All descriptions are machine-readable and can thus be picked up for mash-ups or individual Websites.

The service is run and developed by Donat Agosti, Terry Catapano, Christiana Klingenberg and Guido Sautter. Its development is supported by grants from the US National Science Foundation (to the American Museum of Natural History: Christie Stephenson and Tom Moritz), the German Deutsche Forschungsgemeinschaft (to University of Karlsruhe: Klemens Böhm) and the Global Biodiversity Information Facility (GBIF; to and Zootaxa). It is collaborating with the Hymenoptera Name Server at Ohio State University (Norm Johnson), Zoobank (Richard Pyle), University of Massachusetts (Robert Morris), (Brian Fisher) and Zootaxa (Zhi-Qiang Zhang). It was released to the public at the EDIT "IPR and the web: challenges for taxonomy" meeting in London on Feb. 20, 2008.
