DOI! D!O!I! D-O-I! D.O.I.!

I love the DOI. It's the best thing since sliced bread. Actually, it's better than sliced bread - I can slice my own bread - but I can't do what DOIs do so easily.

If you've been living under a rock for a while, you might not know that a DOI is a digital object identifier: a unique identifier at the article or chapter level (or really at any level, like each image, each paragraph, or the whole book), the way you have ISBNs for books and ISSNs for journals (titles). What's really cool is that you can just put http://dx.doi.org/ in front of one and get directed to the publisher's page for the article. What if you don't have access to the article at the journal's site? Well, you can enter the DOI into your institution's handy OpenURL resolver thingy (like SFX) and it will find the best place for that document (somehow this works less well, not sure why). It's persistent, it's interoperable, and it's actionable (see the site: http://doi.org). The publisher can move the document around and keep the same URL using these fabulous things (it's a handle, too).
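To make the prefix trick concrete, here is a minimal sketch in Python. The helper name `doi_to_url` and the sample DOI are made up for illustration; the only thing taken from the paragraph above is the http://dx.doi.org/ prefix.

```python
from urllib.parse import quote

RESOLVER = "http://dx.doi.org/"  # the resolver prefix mentioned above


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI into a resolver link (hypothetical helper)."""
    doi = doi.strip()
    # Drop a leading "doi:" label if someone pasted one in.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return RESOLVER + quote(doi, safe="/")


# Made-up DOI, used only to show the shape of the result.
print(doi_to_url("doi:10.1000/example.123"))
```

The percent-encoding step is there because DOI suffixes are allowed to contain characters (spaces, angle brackets and so on) that would otherwise break a URL.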

Publishers have to pay for them - really just to keep the apparatus up - but it's worth every penny. A lot of publishers have gone back and assigned DOIs to their whole digital backfile (to, oh, 1680 or something).

I'm not the only one who thinks they're handy: APA has required the inclusion of the DOI in the citation for a while now. Oh, and if you use Connotea or ResearchBlogging.org, you can just enter the DOI and the system will fill in the rest of the citation information for you.
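How those sites do their lookups isn't described here. One route that exists for DOIs registered through agencies such as Crossref is content negotiation: you ask the resolver for citation metadata instead of the landing page by sending an `Accept` header. The sketch below is an assumption about how a tool might do this, not a description of any particular site; the function names and the sample DOI are made up.

```python
import json
import urllib.request

CSL_JSON = "application/vnd.citationstyles.csl+json"


def metadata_request(doi: str) -> urllib.request.Request:
    """Build (but do not send) a request asking for citation metadata."""
    return urllib.request.Request(
        "https://doi.org/" + doi,
        headers={"Accept": CSL_JSON},
    )


def fetch_metadata(doi: str) -> dict:
    """Send the request; needs network access and a registered DOI."""
    with urllib.request.urlopen(metadata_request(doi), timeout=10) as resp:
        return json.load(resp)


# Made-up DOI: we only inspect the request here, nothing is sent.
req = metadata_request("10.1000/example.123")
print(req.full_url)
print(req.get_header("Accept"))
```

Calling `fetch_metadata` with a real, registered DOI should return a dictionary with fields such as the title and authors, though exactly which fields come back depends on what the publisher deposited.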

A couple of niggling points:

  • there are still a couple of publishers out there who don't participate (I know! Isn't that crazy?)
  • sometimes the research databases don't include them in the exported citation or in the direct export to a citation manager (WHY????)
  • sometimes you get an article before it has a DOI, or after there's a DOI listed on it but before it's registered in the system. I think this is becoming rarer, but it's a PITA. I get an RSS feed of early view articles from my society's publication. It used to be that if you read the feed immediately, it would be 50-50 whether the DOI would work (it would always work a week later). Now it seems to be pretty immediate.
  • some ebook vendors don't provide these at the chapter level. It's really nice when they do. It would be really nice if the various CRC netbases did.

If you enjoy reading about specifications and standards and all that jazz, check out the DOI site (http://doi.org). For the rest of us, use DOIs and be happy.

Honestly, there is one thing I've never quite gotten about DOIs. Why not add the four characters "doi:" in front to make it fit as a URN scheme? Is there already such a mapping specified? While I understand that DOIs are about more than just the Web, in a sense, so are URNs, so implementing DOI as a particular and very nice URN schema makes sense to me, but maybe I'm missing something that makes this unreasonable...

Do DOIs actually do anything for us that URLs would not do if only people did not insist on messing with them? So far as I can see, a DOI is only as good as the assurance we have from the relevant organizations that it will never be messed with in a similar way, never changed, deleted, lost, or otherwise rendered useless. How far can we really trust that will be so? Doesn't introducing the IDF (or whatever) into the system just introduce one more possible point of failure, one more organization that might become careless or moribund? If they were actually archiving documents, that would be a different matter, but DOIs (unless I completely misunderstand) are just a second, redundant set of pointers.

Why would it not be just as good (and easier) to extract an undertaking from the relevant organizations (publishers, presumably) that they will, in future, refrain from changing the relevant URLs? Presumably, to make the DOI system work, publishers (or whoever else archives the actual files) have already had to undertake to make sure either that the files are never moved, or that if they are moved, the DOI will always be promptly updated to point to the right place. How is this easier (and more reliable, because it relies on fewer fallible organizations) than simply refraining from changing URLs?

What am I missing here?

@Chris - I don't know about the URN thing, but it seems to be discussed a lot on the doi page. See: http://www.doi.org/handbook_2000/enumeration.html#2.9.3
@Nigel - well, I think we've proven that publishers can't and won't promise to keep the URL stable! The point for handles or purls, afaik, is that you have one stable url and then the longer one can change at will and you just update the registry. So if a journal or even the publisher gets sold or transferred or if it's archived by another service, there are some legitimate times to update the url.

@Nigel - In reality, ownership and systems change. Ideally, every site admin would keep up the integrity of their URLs, past and present. However, the 404 error is far more common. When a journal changes hands or the hosting platform changes, the URL changes but the DOI does not. I think DOIs give the publishers a business case to keep up access to their content when the changes occur, but do not come with the perceived admin overhead. It is a single point of updated information versus a nest of httpd.conf edits or .forwards.
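The indirection described in the two replies above (one stable identifier, one registry entry that gets updated when the content moves) can be sketched as a toy in-memory registry. This is only an illustration of the idea, not how the real handle system is implemented; the class name, the DOI and both URLs are made up.

```python
class ToyRegistry:
    """Hypothetical in-memory stand-in for a DOI/handle registry."""

    def __init__(self):
        self._targets = {}  # maps DOI -> current URL

    def register(self, doi: str, url: str) -> None:
        self._targets[doi] = url

    def update(self, doi: str, new_url: str) -> None:
        # Only the registry entry changes; the DOI itself never does.
        if doi not in self._targets:
            raise KeyError(f"unknown DOI: {doi}")
        self._targets[doi] = new_url

    def resolve(self, doi: str) -> str:
        return self._targets[doi]


registry = ToyRegistry()
doi = "10.1000/example.123"  # made-up DOI
registry.register(doi, "http://old-publisher.example/article/123")

# The journal changes hands: one registry update, and every citation
# that used the DOI keeps working.
registry.update(doi, "http://new-publisher.example/content/123")
print(registry.resolve(doi))
```

Anything that cited the DOI rather than the old URL needs no change after the move, which is the "single point of updated information" mentioned above.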

I guess you, it's not "Aussie, Oi! Oi! Oi!", but "Documents, DOI! DOI!DOI!" :-)

By BioinfoTools (not verified) on 07 Sep 2009

Edit: I guess for you, ...

(Sorry, rushed typing.)

By BioinfoTools (not verified) on 08 Sep 2009