Open science, openly arrived at

As an academic researcher I don't write grant proposals for a living, although sometimes it feels like I do. I need grants to do my work, but I also need to get to work and I don't consider myself to be commuting for a living. Although sometimes it feels like I do. Having said that, anything that requires even more compliance paperwork for a grant proposal is low on my list of favorite things. But the National Science Foundation (NSF) is about to spell out a new compliance paperwork requirement, and frankly I approve of it. In principle, at least, although I won't like doing it if it spreads to my own granting agencies. NIH already has something like it and needs to do more, even if I'll hate doing it.

We are talking about something euphemistically called a "Data Management Plan." It's really an open data access initiative for products paid for by the taxpayer and it's overdue:

Scientists seeking funding from the National Science Foundation (NSF) will soon need to spell out how they plan to manage the data they hope to collect. It's part of a broader move by NSF and other federal agencies to emphasize the importance of community access to data.

[snip]

NSF wants to avoid a one-size-fits-all approach to the issue, [Edward Seidel, acting head of NSF's mathematics and physical sciences directorate] explained, because each discipline has its own culture about data-sharing. "A scientist might say that my plan is that I don't need one, because I don't save my data," he told the board committee, which has just formed a task force on data policy. "The important thing is that it puts people on notice that they have to think about it, maybe for the first time." NSF Director Arden Bement said he expects that some applicants will request additional funding to implement their data management plan, making it another factor that reviewers will need to take into account in weighing the value of a proposal. (Jeffrey Mervis, ScienceInsider)

Sounds good, although it doesn't sound specific enough or tough enough to me. Yes, each discipline does have its own culture about data-sharing. And in many cases that culture has to change. We've inveighed often here about the shameful practice that many senior and well-respected flu scientists have of keeping their sequences private until they publish -- if they publish using them. If not, no one gets to see them, even if we paid with tax money to collect them. The motives are often unselfish -- a senior scientist trying to protect post-docs or grad students from being scooped. Very Old School. This is the 21st century. We have our own students and we take mentoring very seriously. And one of the things we teach them is that if they have information of importance to public health, then it is to be made public. You don't make any deals with anyone that you will keep it confidential. Period. And you don't keep hold of it on your own initiative, either. Influenza virus sequences are matters of public health importance. If you are worried that your career or the careers of your students or post-docs will be harmed by releasing them as soon as practicable, then you are in the wrong field. Choose a field or a virus where it doesn't matter. But keeping those sequences private is part of the "culture of the discipline." And it needs to change.

As an epidemiologist, I can spend years of hard work collecting data. I want to use those data and reap their benefits, both for public health and for me personally and my students and post-docs. That doesn't mean I get to hoard them. It means that I have to use them in a timely way. I have an advantage over everyone else because I know the data better than they do and I have them before they do. But I don't have any ownership rights over them. If someone else can use my work, that's what science is all about. Making data available and accessible should be part of the culture of my discipline. It isn't, sad to say. But what should also be part of the culture is that if I use someone else's data (or vice versa), data made accessible to me by virtue of a granting agency's policies, I should give full credit to those who collected them, and that credit should count in terms of academic appointments and promotions.

I know there is genuine resistance and resentment about this among my colleagues. I think it's a losing battle for them, however, and I hope they lose it sooner rather than later. Science will be better off. The internet allows public access to an unprecedented degree, and the amount of "raw brain power" out there is quite incredible. It is not inconceivable to me that access to data could allow some amateur scientist or very smart lay person to make an important scientific discovery.

Open science, openly arrived at. That's the culture I'm talking about.

"it doesn't sound specific enough or tough enough to me".
I suspect it's hard to make rules valid for every discipline. There should be different rules for different panels.
For example, I'm a mathematician. I don't have any data. I'm not doing experiments, ever.
On the other hand, I know that a lot of more-or-less ready material in my field circulates informally before its arXiv release, sometimes for 6+ months, or even years. This gives an unfair advantage to the "friends of" who get to see said material before the rest of the community. Maybe a rule that says "you can't keep your papers hidden" would seem as weird to you as your preoccupation with data is to me.

estraven: You are right, of course, that there are differences. My son is a mathematician and I hang out in math departments quite a bit. The LANL preprint arXiv really blazed the trail for the rest of us, but it is pretty generally available if you know about it. And it happened because it took you guys so long to referee papers. And mathematicians used to be terrible about keeping their stuff secret (think Tartaglia and friends, not to mention Newton). Times change and the post was about changing times.