The Book of Trogool

Not turning up our noses

I gave a talk for PALINET some little while ago about institutional repositories. The audience had been primed by the fantastic Peter Murray to think about looking after digital content as the “fourth great wave” of library work. (I wish that talk was online. It was absolutely brilliant.)

But not everyone was entirely on board with that. I recall distinctly one distinguished-looking white-haired gentleman raising his hand. “We in libraries,” he said (paraphrase mine), “have historically been purveyors of quality information. Authoritative information. On what basis should we jeopardize that raison d’être for institutional repositories?”

Brave man, and he expressed well a resistance I’ve felt in my librarian colleagues near and far as long as I’ve been running IRs. Why do you collect that, they ask without asking. IRs launched alongside established digital-library programs suffer worse, the parvenu being simply too déclassé to mention in the same breath as library-blessed digital collections. The funny thing is, in a lot of these situations I suspect the digital library was resisted for a long time too; I suppose I can only shrug and be mildly pleased that IRs legitimize digital libraries by being the next target of scorn.

The thing is, if libraries are going to involve themselves in digital curation, we’ll have to get over our yen for authority and finality. Even, dare I say it, quality.

Part of the reason for this is that in many fields, data-quality standards haven’t been worked out yet. Cowboy data curators have to do their best and hope. Over time, this problem is likely to become less salient, which I expect will also lessen librarians’ resistance to data curation, but I doubt the issue will ever go entirely away.

A related part of the reason is that data authority is a vexed question, and in most cases (it seems to me) the data will have to be collected and cared for well before the question of authority can be resolved. We just won’t know what data are usefully authoritative until the researcher community has chewed them over a bit.

Part of the reason is that if we want decent-quality, well-described data, we just can’t sit around until it’s final. I’ve any number of war stories about stupid data that didn’t have to be stupid; its collectors just didn’t think through what they were doing until it was much, much too late. A librarian (any librarian!) could have asked the right questions and pointed to some of the right answers, but only early enough in the process to ensure that librarianly insights made it into the data-gathering process.

Sometimes, for all our best efforts, we’ll find a dataset that needed an intervention that it didn’t get. Sometimes, we’ll have to sigh and take it anyway. Irreplaceability is one cogent reason to do so.

I expect that many librarians will find this an unpalatable set of outlook changes. The only counter I have is that they are necessary outlook changes if we are to participate in this service cluster.


  1. #1 Steve Hitchcock
    August 13, 2009

    Why should librarians give up their QC role for anything-goes IRs? Call it the digital flip. For print the filters are on the input side. For digital the filters are on the output side. Ask Google.

  2. #2 Dorothea Salo
    August 13, 2009

    Well, the funny thing is, librarians’ QC role, considered in the context of the entire process of knowledge production, is actually pretty minimal. We decide what to buy — AFTER the whole process of selection, editing, review, and publication.

    It’s not authoritative because we selected it. We selected it because we thought it was authoritative.

    So another way of saying what I said in this post is that we’re getting involved with data well before the QC processes that we take for granted. The thing I find nifty about that is that our involvement may actually help the data be higher-quality and more authoritative!