An encyclopedia of life?

I wish I could be more enthusiastic about the Encyclopedia of Life project. It's to be an online encyclopedia with a substantive page dedicated to each species on the planet, and it's endorsed by E.O. Wilson, with sponsorship from some of the most prestigious museums around. It's a fantastic idea that would be incredibly useful.

But then …

The demonstration pages are beautiful, maybe too beautiful. There's the promise of a colossal amount of information in each one, although at this point all they've got are very pretty but nonfunctional images of what the page will look like — but you can see that the content is not trivial and the organization is detailed. I browsed the FAQ to see how they're going to do it, and it's awfully vague. "Mashups" of existing databases? Recruiting sources from existing online collections? They were inspired by Wikipedia? (No, please no…if it's wikified it will be useless as a source of technical information.) I look at the money they've got — $12 million — and the number of species they aim to catalog — 1.8 million — and it just doesn't add up to me. Aren't they going to burn through that much money in just paying for the emergency room visits for the brawling systematists fighting over the ontological issues?

I don't mean to sound so negative, since I think it's an eminently laudable goal, but I get very, very suspicious when I see all the initial efforts loaded towards building a pretty front end while the complicated core of the project is kept out of focus. I'd be more impressed with something like NCBI Entrez, which, while not as attractive as the EOL mockups, at least starts with the complicated business of integrating multiple databases. I want to see unlovely functionality first, before they try to entice me with a pretty face.

(via John Logsdon)

I'd actually be encouraged by the demonstration pages. I've done enough technical sales of web applications to know that glitz sells, and having the beautiful demonstrations serves to reinforce your goals & requirements. Not having these is usually what gets you into trouble. As an encyclopedic source, it looks very comprehensive; so long as only experts are allowed to modify entries, I can see this being a very useful popular reference source. And from what I understand, the $12M is just a quarter of the expected funding, so it looks reasonably well funded, for now.

I'm pretty skeptical of this as well. I at first thought it would be a good idea, but then the similarities to Wikipedia became too great. The whole idea of "mashups" of existing databases is what gets to me. This just seems like a lot of prettily packaged, unsifted words.

If the information were limited to CREDIBLE sources, this could be really cool... BUT...

A "mashup" doesn't bother me that much -- that's actually a good idea. If you look at Entrez, you'll see that that's what they do, tying together many databases so that you can get a window on a whole constellation of information with a single search.

Modeling it on Wikipedia would be a disaster, but they only say they are "inspired" by Wikipedia, whatever that means. They claim to be heartened by the observation that 1.5 million entries were put on Wikipedia in four years, but I'm disheartened by that. That's what you get when you open the doors wide to everyone, but I'm assuming that EOL will only be opened to a very limited set of experts.

It also ignores the fact that a lot of those 1.5 million entries are answers to trivia questions, like complete cast and episode lists for Babylon 5. A wikipedia entry is simply not comparable to an EOL entry.

There are also probably far more people interested in and knowledgeable of the (well-deserving of attention) phenomenon of Babylon 5 than the finer points of various species.

It's a lot easier to have the general populace work on pop-culture topics than biology.

Of course, if the editors were limited to people from the fields themselves, but the general public could access it... that could be very, very cool. There's definitely potential in the concept - the question is whether this particular instantiation will actually deliver on that potential. Being inspired by Wikipedia is a mark against it, I'd say.

By Caledonian (not verified) on 09 May 2007 #permalink

According to New Scientist, "Scientific organisations have already pledged $50 million, including EOL's host, the Smithsonian Institution in Washington DC, US." So it's not quite as bleak financially, but at the moment that still averages to just about $28 a species.

...and of course almost nothing is known about most of those 1.8 million spp. except the fact of their recent existence.

PZ Myers:

They claim to be heartened by the observation that 1.5 million entries were put on Wikipedia in four years, but I'm disheartened by that. That's what you get when you open the doors wide to everyone, but I'm assuming that EOL will only be opened to a very limited set of experts.

Exactly. One should note that this can also be implemented using a wiki, even the MediaWiki software which underlies Wikipedia. It's pretty easy to configure MediaWiki so that only users with accounts can edit pages (or even read them, if that's what you'd prefer), and then control who can get user accounts. That way, you have the collaborative platform, revision logging and other good stuff that the wiki approach gives you, without the open floodgates and the consequent deluge of Pokemon animals.

I agree with those who are calling for expert editors, but readable by everyone. An open Wiki produces - well, Something Awful is a web site that highlights the worst of the web, either by screen capture or by parody. Most of their parodies are themselves pretty bad (it's not an easy art form), but they recently hit the nail on the head as far as Wiki:

http://www.somethingawful.com/d/news/awfulpedia-babies.php

While not so pretty, iSpecies.org is a species search engine that also mashes up data from different sources (including NCBI stuff) for organisms.

I want to see unlovely functionality first, before they try to entice me with a pretty face.

Oh, PZ, if only one of my clients were ever to say that, I would be so happy! Usually I have to deal with people who know how they want an application to look (often in excruciating detail), but aren't quite sure what it should do.

I've done enough technical sales of web applications to know that glitz sells, and having the beautiful demonstrations serves to reinforce your goals & requirements. Not having these is usually what gets you into trouble.

Which - the beautiful demos, or the goals and requirements? I've had clients literally say "We don't care if it works, as long as it looks good" (and that's a direct quote, not hyperbole). I have had clients with beautifully detailed layouts from professional design agencies, but only the vaguest idea of what their actual business model might be, never mind the level of detail that you would need to even begin designing the layouts sensibly. How do you design a page before you've figured out what needs to go on it?

Needless to say, such projects never turn out well...

The Chicago Tribune reports that it is predicted to cost $100 million by the end. This is still only $55 per species, but not all species will require so much time and effort. I presume that there are a large number of species that can be covered rather quickly and inexpensively. This may raise the available average for the more obscure species.

Either way, I am optimistic that with funding from such prestigious sources the project will have more pressure to be scientific and legitimate; wikipedia just seemed to be more of an emergent phenomenon of average collective information.
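For anyone who wants to check the per-species figures being thrown around in this thread, here is a minimal sketch of the arithmetic. It uses only the numbers quoted above ($12 million, $50 million, $100 million, and 1.8 million species); the function and variable names are made up for illustration, and it assumes the money is spread evenly across all species, which nobody is actually proposing.

```python
# Funding figures quoted in the post and comments, in US dollars.
# Names here are illustrative only; nothing below comes from EOL itself.
SPECIES_TARGET = 1_800_000

FUNDING = {
    "initial grants": 12_000_000,
    "pledged (New Scientist)": 50_000_000,
    "projected total (Chicago Tribune)": 100_000_000,
}


def dollars_per_species(total, species=SPECIES_TARGET):
    """Average budget per species, assuming an even split."""
    return total / species


for label, total in FUNDING.items():
    print(f"{label}: ${dollars_per_species(total):.2f} per species")
```

An even split is the crudest possible model. As noted above, well-studied species should cost far less than the average, leaving more for the obscure ones.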

I'll be somewhat MORE negative than PZ. It's a 'wombat' project. The shallow 'field guide' mentality has been transferred to the 'desk guide.' Field guides have done much to foster awareness of the natural world-- but here are some titles you'll never see: A Field Guide (AFG) to the Gelechiid moths of eastern North America, AFG to the Pteromalid wasps east of the Mississippi, AFG to the genus Euxoa (cutworm moths, in part) of North America, AFG to the Miridae (plant bugs) east of the Great Plains. Popularity aside, these are a few of the important insect groups that cannot be done in the FG format-- but most of the material is already available in the primary research journals.

If there's an extra 50 million floating around for systematics work, let's trowel it into molecular genetics or alpha taxonomy. By way of example: In eastern North America, 3 (that's three) NEW species of swallowtail butterflies have been described since 1980. I've described TWO new species of Noctuid moths found in godforsaken North Dakota (the type localities were elsewhere). One look at bird taxonomy shows there's much to be done. Recent molecular work showed that the Cuban Ivory-billed woodpecker was distinct from its extinct American counterpart, being as far removed as from the Mexican Imperial woodpecker. There is enough groundbreaking taxonomic work left to be done for many generations of humanity, and a time frame of perhaps less than one generation to accomplish it in.

I'd rather see them combine forces with the Tree of Life pages. There is already a detailed methodology for Tree of Life that involves getting experts on particular taxa to write the pages, it includes discussions of uncertainty in unresolved portions of the tree, and is quite thorough and detailed when they get an author to submit (of course getting busy researchers to take the time to do so is probably the main reason the taxon sampling is still fairly small there).

I don't see anything wrong about a wiki approach if the editors are selected professionals, given that there are enough editors. MediaWiki is just a technology. The pages apparently have many bits of information, which is just right for a wiki approach.

Oh well, apparently that is not the case of the EOL.

By Dunkleosteus (not verified) on 09 May 2007 #permalink

Larry Moran won't be pleased that they're using the 3 domain system.

I think you missed something important:

"Unlike conventional encyclopedias, where an editorial team sits down and writes the entries, the Encyclopedia will be developed by bringing together ("mashing up") content from a wide variety of sources. This material will then be authenticated by scientists, so that users will have authoritative information. As we move forward, Encyclopedia of Life and its board will work with scientists across the globe, securing the involvement of those individuals and institutions that are established experts on each species."

This approach of having a global network of taxonomic experts involved is already being done by ITIS and FishBase, among others.

No, I caught that. That's exactly what would work: combining existing databases, and having expert authentication.

My objection is that that isn't easy or cheap, and that's precisely the hard part of this project. The mockups don't demonstrate proof-of-concept of the hard part, they only show off the elegant work of some web designers.

No props to Wikispecies? It's a project that launched in August 2004 and is sponsored by the same foundation behind Wikipedia. http://species.wikimedia.org

Of course it's an open wiki, which some people hate but everyone else loves.

It's a noble goal, and I'm excited to see people try it. Even if they never cover 100% of species it would still be a great resource. If they take a top-down approach, a short-term goal of covering at least 1 species per genus seems reasonable. I would guess that the mammal section will get completed fairly fast as well. The long tail will be the hard part, but is it really a travesty if they only manage to cover the top 100,000 species of beetles?

One key will be to get administrators of organizations that employ taxonomists to REWARD taxonomists for getting historical and new species descriptions on to EOL.

Successful integration with other databases is also going to be key. As noted, EOL should connect with the Tree of Life website, but the information provided is not the same, with ToL providing a historical hierarchy connecting each of the EOL pages. But think of all those nodes in the Tree of Life that will need descriptions! NCBI, ITIS, etc. should similarly be linked up to EOL pages. Moreover, a bunch of different relevant databases aimed at particular taxa already exist (FishBase, CephBase, Hexacoral, etc.), and a number of these face the real possibility of becoming financially orphaned. They were started with some funds associated with PEET or AToL projects or whatever, and those funds are going to run out. The trouble is, many of these have more functionality than EOL (geographic ranges, environmental and/or morphological data), and they'll still need support to be maintained. I'm guessing proposals to support the creation of some grand database are typically more attractive than grants to keep a nice database going.

I get very, very suspicious when I see all the initial efforts loaded towards building a pretty front end while the complicated core of the project is kept out of focus... I want to see unlovely functionality first, before they try to entice me with a pretty face.

You need to get with the program, PZ; that's the perfect description of Web 2.0!

It's all the rage down our way.

By Ian B Gibson (not verified) on 09 May 2007 #permalink

No, I caught that. That's exactly what would work: combining existing databases, and having expert authentication.

My objection is that that isn't easy or cheap, and that's precisely the hard part of this project. The mockups don't demonstrate proof-of-concept of the hard part, they only show off the elegant work of some web designers.

I guess I haven't seen evidence that all they are planning to do is focus on a pretty front end. My experience with big science is that one has to have *something* to show potential users, investors, and contributors early on to generate support and interest. Show that it can work in principle, and then it is more likely that you will get funds and people to carry it through. Again, I point to FishBase or ITIS or the BioSystematic Database of World Diptera or Tree of Life. The approach can work, and it seems to be the only way to make the massive information on the world's diversity open to non-experts. On that basis, I am behind the project, though obviously it will have to succeed where All Species did not.

"Nothing will ever be attempted if all possible objections must first be overcome."
- Samuel Johnson

This reminds me of my father's frequent out-loud wonderings, which all seem to go something like:

"Well, I just don't see why they can't make a website that will just find whatever you want for you."

If I were going to make an online encyclopedia of all living things, I'd make it a hybrid of the visual thesaurus (to see how things are connected), kartoo (to show divisions and groupings), and wikipedia (simple visual style for the delivery of the information, easy on the eyes.)

By Will Von Wizzlepig (not verified) on 09 May 2007 #permalink

Just a driveby to mention an amusing point: I was reading Parkinson's Law and Other Studies in Administration (yes, it was originally an Economist article, and got compiled in a book with a bunch of other articles), and one chapter was describing, in hilarious detail, how organizations seemed inevitably to be dead when the perfect architecture for them was complete.