Tactics
BMC Bioinformatics published this article describing a "data publishing framework" for biodiversity data.
Stripped to its essentials, this article is about carrots for data sharing. Acknowledging that cultural inertia (some of it well-founded) militates against spontaneous data sharing, the authors suggest a way forward.
I'm calling this one out because it has implications for storage-system design. The authors want three things for their public data: persistent identifiers, citation mechanisms, and data usage information.
(For once, I feel good about institutional repositories: they swing…
Libraries do collaborative collection development, through consortia and increasingly via direct institution-to-institution arrangements. Reference and instruction are collaborative endeavors—look at any social-networking service with lots of librarians and you'll see on-the-spot crowdsourced reference responses.
Perhaps this collaboration instinct will help libraries respond to the challenge of domain expertise for data curation. Do I need to know cheminformatics, or do I just need to buy a cheminformaticist conference potations until I secure her business card?
Formalizing expertise-sharing…
I read the RIN report on life-sciences data with interest, a little cynicism, and much appreciation for the grounded and sensible approach I have come to expect from British reports. If you're interested in data services, you should read this report too.
A warning to avoid preconceptions: If you pay too much attention to all the cyberinfrastructure and e-science hype, it's very easy to fall prey to the erroneous notion that most of science is crunching massive numbers via grid computing and throwing out terabytes of data per second.
It ain't so. It never was so. Will it be so in future? Not…
I pointed out Mike Lesk's slideshow in my last tidbits post, finding it a good critical précis of the data problem. It's pleasantly aware of human problems, problems that many treatments of cyberinfrastructure (including, unfortunately, this otherwise useful call to action from Educause) wholly ignore.
So wince and flinch at the design (black Arial on white? really? in 2009?), but read the slideshow anyway.
I do want to pick apart the slide from which I took the title of this post. I reproduce the said slide's text in full:
Can we just give the problem to the libraries?
As a professor in a…
I've lived all my short career in academic libraries thus far on the new-service frontier. In so doing, I've looked around and learned a bit about how academic libraries, research libraries in particular, tend to manage new services. With apologies to all the botanists I am about to offend by massacring their specialty, here is my metaphor for the two main courses of action I see: grafting the new service on like an apple branch to a crab-tree, or hybridizing the new service with existing services, thus changing the library from the ground up.
Each approach works in some situations, it seems…
I'm still buried in translating a presentation into Spanish for Monday and finishing another in English for Wednesday, but here's a small thought to tide folks over, a thought that came to me shortly before my presentation at Access.
At the data-curation workshops I've been to, it has been axiomatic that "we can't afford to keep it all." Some fairly sophisticated judgment rubrics have been worked up, often based on the same kinds of judgment calls that special-collections librarians and archivists make when presented with collection opportunities. Is this dataset unique, or could it be…
Roy Tennant sent me an email about my Access presentation in which he asked what libraries should do about the laundry list of data-curation challenges I presented. (If you're curious, you can go view the presentation yourself, courtesy of the wonderful A/V folk at Access. The less-than-an-hour-long way to assimilate the same information is to look over slides plus talk notes on SlideShare.)
That's an eminently fair criticism. I've been thinking about it since receiving the email. I think the answer for libraries is to set their own digital houses in order first thing. After all, how can we…
In many of the data-curation talks and discussions I've attended, a distinction has been drawn between Big Science and small science, the latter sometimes being lumped with humanities research. I'm not sure this distinction completely holds up in practice—are the quantitative social sciences Big or small? what about medicine?—but there's definitely food for thought there.
Big Science produces big, basically homogeneous data from single research projects, on the order of terabytes in short timeframes. For Big Data, building enough reliable storage is a big deal; it's hard to even look at the…
I commented here earlier, not without frustration, about a pair of researchers who built and abandoned a disciplinary repository. I was particularly annoyed that they seemed to have done this purely for self-aggrandizement, apparently feeling no particular attachment to the resulting repository.
Such as they should not open repositories. Neither they nor any service they offer is trustworthy. I hope that's uncontroversial. Unfortunately, even vastly better intentions than that don't guarantee the sustainability of the result, even in the short term.
The Mana'o anthropology repository, started…
Many doctoral institutions now accept and archive (or are planning to accept and archive) theses and dissertations electronically. Virginia Tech pioneered this quite some time ago, and it has caught on slowly but steadily for reasons of cost, convenience, access, and necessity.
Necessity? Afraid so. Some theses and dissertations are honest digital artifacts, unable to be faithfully represented in ink on paper or in other analog fashion. Others might be flattened into analog, but that wouldn't be their (or their author's) preference. Still others contain digital artifacts of various sorts.…
Many of my readers will already have seen the Nature special issue on data, data curation, and data sharing. If you haven't, go now and read; it's impossible to overestimate the importance of this issue turning up in such a widely read venue.
I read the opening of "Data sharing: Empty archives" with a certain amount of bemusement, as one who has been running institutional repositories in libraries for four years. I think Bryn Nelson has confusingly conflated different notions of "data" in his discussion of the University of Rochester's IR.
By the definition Nelson appears to be thinking about…
I said a while ago that we don't know who's going to do data curation yet. I absolutely believe that.
I probably should have added, though, that we can have a pretty good idea who's not going to do it: anybody who isn't right this very minute planning to do it.
Make no mistake, there's money (from funders and institutions) and hard-won relevance to be had in this line of work. Quite a few people and organizations are eyeing it: IT, libraries, scholarly societies, journals, entrepreneurs.
If you want to get into the scrum, if you want a piece of the pie, better get your plan on now. This is no…
The publisher Information Today runs a good and useful book series for librarians who find themselves with job duties they weren't expecting and don't feel prepared for. There's The Accidental Systems Librarian and The Accidental Library Marketer (that one's new) and a whole raft of other accidents.
I suspect "The Accidental Informaticist" would find an audience, and not just among librarians.
The long and short of it is, we just don't know who is going to do a lot of the e-research gruntwork at this point. Campus IT at major research institutions is seizing on the fun grid-computing work,…
I gave a talk for PALINET some little while ago about institutional repositories. The audience had been primed by the fantastic Peter Murray to think about looking after digital content as the "fourth great wave" of library work. (I wish that talk were online. It was absolutely brilliant.)
But not everyone was entirely on board with that. I recall distinctly one distinguished-looking white-haired gentleman raising his hand. "We in libraries," he said (paraphrase mine), "have historically been purveyors of quality information. Authoritative information. On what basis should we jeopardize that…
Unconnected incidents are making me ponder questions of sustainability. I don't have any answers, but I can at least unburden myself of some frustrations!
I learned from a colleague that arXiv is looking for a new funding model, as Cornell is wearying of picking up the entire tab. Various options are on the table, and I'm not competent to opine on their feasibility. I'm more interested in the larger question: how are we, we libraries and we researchers, organizing to shoulder the burden of electronic archives, especially open-access ones?
Historically, the answer has been "not effectively." I…
Another thing I meant to call out in the context of the Jupiter-goes-boom event was the nod to data gathered by people who aren't connected to the formal research enterprise save tangentially.
This event was first noted by someone not an astronomer by profession, and the article notes that this is hardly the first time astronomers have been scooped. My husband, who is an extremely amateur skygazer and likes to hang out on online astronomy bulletin boards, says that his impression is that astronomers mingle with enthusiasts fairly freely, all things considered, and both sides appear to benefit…
And we're back! (With a four-note theme. Wait, that's Peter Schickele on Beethoven. Never mind.)
So yesterday before our enforced break, I asked what we could learn about e-research from a big chunk of space flotsam hitting Jupiter. What had caught my eye was this passage:
… the planetary astronomy community has been filled with excitement—emails are flying, with people exchanging information about the new discovery and its development. Major observatories are canceling their scheduled observations so that they can point their telescopes at Jupiter.
Why are they doing this? Because this is…
Lively welcome here at ScienceBlogs, I must say. Two posts, a soft launch, and eighteen comments already!
The comments have turned up a question deserving of further discussion. On my first post, commenter Jim Lund said:
E-research? Why make a distinction? Today there's only e-research and archaeology. :)
And on my second, commenter rnb said:
Computers have been used to investigate circuit behavior since I was in college back in the 70s. So should engineers be called e-engineers?
Not trying to put words in their mouths here, but it seems to me they're getting at the same question about how…