The Book of Trogool

The accidental informaticist

The publisher Information Today runs a good and useful book series for librarians who find themselves with job duties they weren’t expecting and don’t feel prepared for. There’s The Accidental Systems Librarian and The Accidental Library Marketer (that one’s new) and a whole raft of other accidents.

I suspect “The Accidental Informaticist” would find an audience, and not just among librarians.

The long and short of it is, we just don’t know who is going to do a lot of the e-research gruntwork at this point. Campus IT shops at major research institutions are seizing on the fun grid-computing work, and they’re at least investigating collaboration solutions, but some of them seem to be balking pretty hard at providing the big disk necessary for data curation, never mind the human resources necessary to do data curation anything like right.

Having campus IT handle these services can also create a tremendous gap between haves and have-nots. Grant-funded science can pay into cost-recovery operations, which many campus IT shops are. Grant-funded science can hire its own IT if it has to, as well as dedicated informaticists (though admittedly they mostly don’t). Anybody who isn’t grant-funded science and has data? Is out in the cold.

It’s worth noting as well that even grant-funded science doesn’t often think past grant expiration. For collaboration tools and grid computing resources, that’s fine. For data curation? Not so fine.

Alma Swan, in a report well worth reading, posits four kinds of data-curation staff: data creators, data managers, data librarians, and data scientists. I’m not sure how far I can go with that. I agree with the skillsets as Swan lays them out; I’m just agog at the idea that any institution or research shop will be able to divvy up these tasks among four whole people!

It doesn’t help Swan’s case in my mind that I myself am half a data librarian and half a data manager. (Swan says that “the boundaries are fuzzy,” but I’m not sure there are any boundaries at all!) Munging data to make it flow from one place to another? Been doing that these ten years. Looking after digital data as best I can to keep it usable for the long term? Sure. That’s what an institutional-repository manager is for, right?

What I’m afraid of is that Swan has reified the job descriptions too soon, and that eager institutions will say to themselves “this! this is what we need!” before they do the hard work of making internal decisions about which pieces of the e-research puzzle they can and should assemble.

Make no mistake, it’s okay not to do everything. If you’re going to focus on big science and leave everybody else gasping, that’s your choice; it may even be the only choice that pencils out, budget-wise. Personally, I’m of the opinion that big science doesn’t need anybody’s sloppy help for the most part, and the interesting problems are found elsewhere, but that’s me.

The point is, I don’t know, and neither do you, and neither does Alma. It’s too soon.

So how do we confront a big nebulous problem? By opening the door to happy accidents, I think. The consulting model for e-research in place at institutions like Purdue makes a lot more sense to me than some other approaches I’ve seen. You can do endless surveys or focus groups to figure out what institutional needs are, but there’s nothing quite like throwing spaghetti at the wall and seeing what sticks.

Sure, it’s a risky strategy. It has the virtue of being cheap, however, compared to buying a lot of big iron before the local need is established. It also assumes that you have the right kind of people on staff: can-do souls comfortable with a lot of uncertainty and able to learn fast. Staffing a speculative consulting operation with the change-averse is a fast road to oblivion.

We’re all accidental informaticists. We’ll all have to learn by doing. That’s okay. Let’s do it!