I gave a talk for PALINET some little while ago about institutional repositories. The audience had been primed by the fantastic Peter Murray to think about looking after digital content as the “fourth great wave” of library work. (I wish that talk was online. It was absolutely brilliant.)
But not everyone was entirely on board with that. I recall distinctly one distinguished-looking white-haired gentleman raising his hand. “We in libraries,” he said (paraphrase mine), “have historically been purveyors of quality information. Authoritative information. On what basis should we jeopardize that raison d’être for institutional repositories?”
Brave man, and he expressed well a resistance I’ve felt in my librarian colleagues near and far as long as I’ve been running IRs. Why do you collect that, they ask without asking. IRs launched alongside established digital-library programs suffer worse, the parvenu being simply too déclassé to mention in the same breath as library-blessed digital collections. The funny thing is, in a lot of these situations I suspect the digital library was resisted for a long time too; I suppose I can only shrug and be mildly pleased that IRs legitimize digital libraries by being the next target of scorn.
The thing is, if libraries are going to involve themselves in digital curation, we’ll have to get over our yen for authority and finality. Even, dare I say it, quality.
Part of the reason for this is that in many fields, data-quality standards haven’t been worked out yet. Cowboy data curators have to do their best and hope. Over time, this problem is likely to become less salient, which I expect will also lessen librarians’ resistance to data curation, but I doubt the issue will ever go entirely away.
A related part of the reason is that data authority is a vexed question, and in most cases (it seems to me) the data will have to be collected and cared for well before the question of authority can be resolved. We just won’t know what data are usefully authoritative until the researcher community has chewed them over a bit.
Part of the reason is that if we want decent-quality, well-described data, we just can’t sit around until it’s final. I’ve any number of war stories about stupid data that didn’t have to be stupid; its collectors just didn’t think through what they were doing until it was much, much too late. A librarian (any librarian!) could have asked the right questions and pointed to some of the right answers, but only early enough in the process to ensure that librarianly insights made it into the data-gathering process.
Sometimes, for all our best efforts, we’ll find a dataset that needed an intervention that it didn’t get. Sometimes, we’ll have to sigh and take it anyway. Irreplaceability is one cogent reason to do so.
I expect that many librarians will find this an unpalatable set of outlook changes. The only counter I have is that they are necessary outlook changes if we are to participate in this service cluster.