Open Notebook Science

I know, I know, many people are still skeptical, but opening one's lab notebooks is part and parcel of the new world of Open Science. There is an opinion piece about it in Nature (also available on Nature's Nautilus blog). Attila Csordas added some very important points today, reminding everyone of the global nature of scientific collaboration.

The few pioneers who have opened their notebooks do it in different ways. Jean-Claude Bradley's group uses both a blog and a wiki. Rosie Redfield's group has one central blog plus each student's own blog (see them here, here, here and here). Bjoern Brembs certainly has a cool blog, though it does not function as a day-to-day lab notebook. His website is full of information, often on things not published yet, e.g., this page showing data before the paper came out. Attila is right that whatever software or method gets universally adopted has to be useful for collaborations between geographically distant labs, not just within labs located in one building.

The editorial states that:

...maximizing their benefits will require a change in culture that many researchers will no doubt initially resist.

But, as I mentioned before, the areas of science that are really competitive (for patents, money, prizes or fame) are the ones that are always in the news and that work best where conflict is needed for the plot: in the movies and in LabLit. Those are the areas that contain the fear of scooping. But those areas are actually relatively small. Most of science is outside the limelight, populated by very gregarious and very generous people following their own curiosities. Most of science outside the "hot" areas (like cancer research) is already collaborative, not competitive, and people in those areas are most likely to be the first to adopt some kind of online Open Notebook style of collaboration.

Of course, I would take a job in a cancer research lab right before I started to learn about Open Access/Open Science...

My boss, who has spent his whole career in cancer research, has a bunch of horror stories about theft of ideas and data. He remains convinced that the scientific community Asshole Count is sufficiently high that, basically, if you make something steal-able someone WILL steal it. Every time, no exceptions.

I hope you're right, that less competitive areas of science will lead the way.

Unfortunately, the scoop-fear engendered by the horror stories in a couple of hot fields tends to percolate down and infest the areas of science where no real threat of scooping exists. It will probably be the very least competitive areas that go open first - that is also one way to let the world know they even exist! The Long Tail effect in a way, coupled with a need for self-promotion outside of normal channels when neglected. Hopefully, the others will then slowly follow suit...

Thanks for this post! As a senior graduate student who'd like to be a PI one day, I'm really interested in the implementation of blogs and wikis as a means of lab communication.

Oops, it turned out that I wrongly assumed that the author of the editorial is Maxine Clarke, so please correct this.

Maxine Clarke's comment on Pimm: "Thanks for the link, Attila. I should just point out that I'm not the author of the article. The article is an Editorial in Nature which I featured on Nautilus as that is where we ask scientists to comment on policy issues before we decide whether or not to make something a policy, e.g., data availability."

I'm all for open data files, but only post-publication.

If datafiles are available online before they are interpreted by the scientific teams that collected them I think you'll encourage freeloading. By this I mean that scientific teams who did not collect the data will be able to garner recognition/reputation for interpreting data that is not their own before the original team has had a chance.

This is problematic because interpretation of data and assimilation of that new information into the corpus of knowledge is usually associated with greater recognition than the collection of that data. That's why research assistants collect the data and PIs are quoted in the press releases. It's also why writing the editorial about a paper being published is regarded as a mark of prestige. You get to explain it to others.

Watson and Crick offer a good example. Whilst I'm not calling them freeloaders, they didn't actually collect the data. They did, however, correctly interpret it and won the resulting Nobel.

The people who put the work into collecting the data ought to be allowed first crack at interpreting it. Often it takes a bit of time to stop and think about what the data is suggesting before simply publishing it. Open peer review at that point seems reasonable to me.

Publishing raw data, on the other hand, MAY encourage freeloading, which is likely to cause a scientific 'tragedy of the commons' effect. Why collect any data when you can just troll the 'net interpreting others' data for them and gaining all the cred? At this point, why bother collecting data if it just gets 'stolen' every time?

With no new data, or a slowdown in the collection of data, science suffers. We're back to sitting around thinking about the problems of nature without reference to any data. The philosophers have already tried that.

Yes, if only a few people make their data free. But in a system in which everyone makes data available, there will be both less incentive to cheat and a stronger mechanism to punish cheaters. The transitional period between closed science and open science (i.e., when some do and some don't) will be the difficult phase.

I'm talking about the tragedy of the commons where the raw data becomes common property and there is no incentive to supply it.

I'm worried about a long term decline in the collection of data.

Hello, thanks for linking to this Editorial (which is free access in Nature as well as on the Nautilus blog), and for the interesting comments above. Our policies at the Nature journals are that data and materials need to be free, but we are aware of some of the problems raised above, too. We'll continue to listen.

In times of ever more limited funding and more and more competition, open science will not emerge. Researchers now have to fight not only for scientific results, but for their own livelihoods and that of their families.
The more funding gets cut, the more it needs to be restricted to topics deemed important. More researchers will accumulate in such "hot areas", making them even more important. Nothing will be shared in such a situation, but instead you will see a rise in scientific misconduct.
Even though you mention my site as publishing data before publication, the data cannot be used for evaluation. Not even in my comparatively unimportant area would I expose unpublished results in the current research climate.
Open Science is the way research should be, but as long as your family's survival depends on it, it won't happen.

Open Science is the way research should be, but as long as your family's survival depends on it, it won't happen.

Then it's up to us -- young scientists who can see the value and potential of Open Science -- to change the system. As we make our way up the food chain, we can begin to change the reward structure: recognize blogs as a way to communicate, support Open Access publishing, encourage Open Science wherever possible.

As we make our way up the food chain, we can begin to change the reward structure:

We won't make it up the food chain if we publish in journals with no or low impact factor or get scooped by letting people look into our data or research strategy.
If you don't want to end up as Dr. Taxi-driver or Dr. McDonald's-burger-flipper, you've got to play by the rules. That's what I hate about our job more than anything.

Should we apply game theory to this? Data-sharers as helpers and data-hoggers as cheaters (or should scoopers be cheaters)? Right now, most people are cheaters. What is the way to get to the point where most people are altruists and cheating is harshly penalized? Isn't scooping limited only to a very small subset of very hot research where much is at stake, e.g., cancer research, while 99% of scientists are gregarious and openly share data at meetings and seminars anyway? Do hot research areas attract potential cheaters more than other areas of science?

Game theory definitely applies here and probably even more so than you imply! The problem is that not only is funding increasingly restricted to "hot" topics, due to budget cuts (making essentially all ongoing research a "hot" topic), but much more than before does this competition lead to hyping of "cold" topics in order to get funds. In the worst case, a "cold" topic researcher is tempted to m(f)ake his topic "hot" with some surprising finding - not for fame or recognition, but to put food on the table. Few people are willing to fabricate data for fame but many more if it helps them to survive. Thus, if there is currently an unprecedented incentive to fabricate data, how much less incentive does this current climate provide to share data?
Indeed game theory applies exquisitely to the research community and currently all incentives are set to "cheat", yielding "all defect" as the only viable ESS. It's about time someone besides us realized this. Actually a few do, see links in this post:
http://bjoern.brembs.net/news.php?item.154
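
To make the "all defect" point a bit more concrete, here is a minimal sketch of the sharer/hoarder game discussed above, written as a two-strategy payoff matrix with the standard ESS test. Everything in it is an assumption made up for illustration, not something measured or taken from the linked post: the strategy names "share" and "hoard", the payoff numbers, and the idea that a community penalty is charged to hoarders no matter whom they meet.

```python
# Hypothetical toy model: two strategies, "share" (open data) and "hoard".
# benefit = what I gain when the other side shares with me (assumed value)
# cost    = what sharing costs me, e.g. the risk of being scooped (assumed)
# penalty = sanction the community imposes on hoarders (assumed)

STRATEGIES = ("share", "hoard")


def payoff_matrix(benefit, cost, penalty=0.0):
    """Return payoff[(me, other)] = my payoff in one encounter."""
    return {
        ("share", "share"): benefit - cost,
        ("share", "hoard"): -cost,
        ("hoard", "share"): benefit - penalty,
        ("hoard", "hoard"): -penalty,
    }


def is_ess(strategy, payoff):
    """Maynard Smith's test: can a rare mutant strategy invade `strategy`?"""
    for mutant in STRATEGIES:
        if mutant == strategy:
            continue
        resident = payoff[(strategy, strategy)]
        invader = payoff[(mutant, strategy)]
        if invader > resident:
            return False  # mutant does strictly better against residents
        if invader == resident and payoff[(mutant, mutant)] >= payoff[(strategy, mutant)]:
            return False  # tie against residents, and mutant holds its own
    return True


def evolutionarily_stable(benefit, cost, penalty=0.0):
    """List the strategies that are an ESS for the given parameters."""
    payoff = payoff_matrix(benefit, cost, penalty)
    return [s for s in STRATEGIES if is_ess(s, payoff)]


if __name__ == "__main__":
    print("no penalty:  ", evolutionarily_stable(benefit=3, cost=1))
    print("with penalty:", evolutionarily_stable(benefit=3, cost=1, penalty=2))
```

In this toy version hoarding is the only stable strategy as long as the penalty is smaller than the cost of sharing, and sharing becomes the only stable strategy once the penalty exceeds that cost. That is just the textbook prisoner's-dilemma result; real research incentives are obviously messier than four numbers in a table.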

I find applying scientific techniques to science politics is a very valid topic worthy of its own thread.