Monday, June 26, 2006

Harnessing the Wisdom of the Crowds

Recently, Perry Peterson sent me a copy of a lecture by Peter Nicholson (President of the Council of Canadian Academies) titled Harnessing the wisdom of the crowds: the new contours of intellectual authority. In this paper he essentially argues “that what qualifies as intellectual authority in contemporary societies – who and what to believe – is changing fundamentally”. His thesis in a nutshell is this: “People today are much less prepared to defer to the experts. But at the same time, we are being swamped with data and information – a glut that cries out for analysis and summary. So there’s a dilemma. Who to turn to? Increasingly the answer is – Well, to ourselves of course, as individuals empowered by a world wide web that has rapidly evolved into a social medium. More specifically, it is a medium that today supports massively distributed collaboration on a global scale that – we can only hope – will help us make sense of it all.” His thoughts are strongly influenced by James Surowiecki’s The Wisdom of Crowds and Mitch Kapor’s notion of massively distributed collaboration.

This is exactly the dilemma in which biological systematics and conservation are stuck. Traditionally, we all rely on expert opinion. Some of it is well documented, with detail down to a single specimen and thus a single observation or collecting event. Other opinions are very cursory: only the name of a species is available, because everything else is kept from the crowd by copyright, and when you finally get hold of the original publication, you find only rudimentary summary information, such as “Central America”.

Similarly, the important RedLists of threatened animals and plants are for believers, since it is not possible to get to the baseline data, which in many cases are hidden in somebody’s brain or drawer, to be opened only for good friends or for money. There is no way that a RedList, or even the more advanced Global Amphibian Assessment, would hold up under the kind of scrutiny that current climate models face (see for example the debate in the New York Times, June 23, 2006, “Panel Supports a Controversial Report on Global Warming”).

In all these cases, it would be much better if all the data were open access and anybody could look at them. Google Earth illustrates the limitation of the approach described above. Whereas we can now look at almost any place in the world at a 25-meter pixel resolution (a single reflectance value for each 25 x 25 meter square on the ground), we are asked to believe that an animal or plant lives somewhere in Central America.

If I am a local naturalist, or working at the San Diego Supercomputer Center, I want the data at the highest possible resolution. As a naturalist, because I want to rediscover the organism and add more observations, especially if it is an endangered and poorly known species. As somebody with powerful computing support, because I would like to model and understand the niche of the species. I do not want to have to ask somebody first to be so kind as to give me the data.

Perry’s company Pyxis Innovation can play a very important role in this transition if they (and GBIF, which is paying to develop a data viewer) come up with a tool for finding and visualizing specimen and ecological data. If they also provide or integrate existing analytical tools, such as those offered at CRIA, then we will improve our science tremendously and leverage the support of the crowd, and at the same time we could check much better who is actually providing access to their data. And this is needed if we want to be able to create something like a global biodiversity monitoring system.

Since there is no standard for peer review of systematics data, simple access to all the raw data will probably be the single most decisive factor in improving the knowledge of our species, since it at least allows any piece of scientific work to be criticized.

If Rod Page gets his way with iSpecies, his mash-up approach, we would also have a way to bring together on one page all the relevant information on a particular species, such as its DNA, images, distribution records, literature, etc. (see semant, and a ppt there on this issue).


