Dividing up the pie

By dsalo on July 28, 2009.

Another thing I meant to call out in the context of the Jupiter-goes-boom event was the nod to data gathered by people who aren't connected to the formal research enterprise save tangentially.

This event was first noted by someone not an astronomer by profession, and the article notes that this is hardly the first time astronomers have been scooped. My husband, who is an extremely amateur skygazer and likes to hang out on online astronomy bulletin boards, says that his impression is that astronomers mingle with enthusiasts fairly freely, all things considered, and both sides appear to benefit.

Astronomy isn't the only field where this happens, of course. The Center for History and New Media projects I mentioned in my previous post are essentially crowdsourced news-gathering turned into history. When I was a graduate student in linguistics back in the day, I had occasion to look at Mayan, which amateurs have been instrumental in deciphering. Birdwatchers no more skilled than I are of material help to ornithologists in providing localized bird counts and similar observations. I am also seeing some renewed excitement about "crowdsourcing" various scientific tasks that can't be done by computers but are too laborious and time-consuming to assign to researchers.

So my question about all this is… who's looking after the data these amateurs and volunteers gather? Do data have to come from an accredited scientist affiliated with an institution before they are worth preserving?

Sometimes these questions have answers. Sometimes, not so much.

This points to a larger question, an elephant-in-the-room question. Whose responsibility is all this data gathering and preservation, anyway? "Individual researchers" is an inadequate cop-out, let's just get that on the table right now; without sustainable support, data die when grants fade or retirements happen.

This leaves a few possibilities: funders (notably government), disciplines, and institutions. None of them is unproblematic—in fact, I would go so far as to say that none of them can solve this problem unaided.

Relying on funders assumes that funders will take a long-term perspective on sustainability. Funders can be fickle about this, even government funders; witness the troubled trajectories of the ERIC education database in the US and the Arts and Humanities Data Service in the UK. Worse, outside government vanishingly few funders have resources and infrastructure to throw at this problem; the most they can do is throw money at it in the form of grants, which is not a sustainable funding model by any means.

The line between disciplines and institutions is often a fuzzy one, honestly. The arXiv is the paradigmatic disciplinary preprint repository—but it is sustained by the Cornell University Libraries. Things were not always thus, but such a handoff isn't exactly unusual.

However. When you ask a researcher about her "discipline," she'll probably start talking about her favorite scholarly society. Where are the scholarly societies in all this ferment about data? Gosh, wish I knew. We'll just pass by the American Chemical Society in silence, shall we? They're an outlier and we should all be glad of that… but where's everybody else? Looking for services that members need? Materials that keep members coming back to the society? Why aren't scholarly societies in the data business? I wonder.

Institutions. Institutions have a built-in challenge dealing with data: they have to deal with them across a wide swathe of disciplines. I can't emphasize enough how hard that is! Different formats, different metadata standards (where there are any at all), different ontologies, different patterns of thought, different workflows… there's just no end to the differences.

In these early days, I see a few different institutional approaches to this problem. One is "follow the money." If you've got million-dollar grants, you'll get red-carpet treatment. No grants? No service. When this model is accused of inequity, it throws its hands up and says "since when was life fair?" Another approach is what I call "help the First Son." In the Pesach parable, the first son is the one who approaches his father asking detailed and intelligent questions about Pesach observance, and receives detailed and intelligent answers.

I don't know about you, but I don't know many First Sons among researchers. A few, yes, but not many. A lot of the researchers I know are Third Sons. "What is this?" they say. And a lot are Fourth Sons, who do not even know how to ask. A First-Son approach leaves our Third and Fourth Sons with no answers.

So what we're left with, when we ask who's responsible for data, is a big muddle. Some disciplines have this pretty much sorted. For them, institutional support may be redundant. Other disciplines are under the funder gun; it's still unclear what the institutional role will be there. Many researchers fall into neither group; either their institution helps them or they get no help.

My worry is that as the pie is currently divided, a lot of researchers aren't getting any.
