Scooped

Why do people worry about being scooped, how do scoops happen, how can there be rumours of discoveries and why isn’t the whole process made transparent?

Asked a commenter a few weeks ago during the speculation about Greg Laughlin’s teasing anagram hint about a pending discovery or new result.

Why do academic scientists worry about being scooped?
Are there situations where it is not a concern? And if so, why the secrecy?
And why isn’t there a way to publicly record precedent without the games and rumouring?

The essence of the scoop is someone working on a result, and a competing group publishing the essentially same result before them.
In blind scoops the teams were unaware of each other; it gets nastier when one team is unaware of the competition, but the second team knows of the other team’s preliminary results or general research direction.
Good clean races occur when both sides know there is competition.

A lot of scientific discovery is not sensitive to scooping: it is incremental and collaborative work, with many people sharing credit over time. In a lot of other cases people might worry about being scooped, but as it happens no one was actually in a position to do so and they need not have worried.

There are many categories of discovery: some are the results of major long term efforts using unique facilities; some are elegant pieces of bleeding edge research using new techniques or insights, but which could be done by any of several research groups should they choose to focus on that path; some are “obvious”, after someone else thought of it; some are obvious, but hinge on proprietary data; and some are the “wish I had done that” – serendipitous discoveries that any number of people could have made if they had only been looking there.
Scoops are possible in all of these categories, and in each they have happened, or could have happened.

For example: Penzias and Wilson scooped Dicke’s cosmology group at Princeton in finding the cosmic microwave background. It was a discovery that required near unique facilities, the Princeton group was doing a purpose built experiment while AT&T made the discovery serendipitously, and earlier. As I heard the story, a theorist who heard a presentation by the AT&T group told them about the cosmological speculation for a CMB and they ran away with it. A painful scoop.

A lot of stuff in genetics and biochemistry seems to be in the second category. Discoveries that require hard work by skilled teams and good equipment, but for any given breakthrough there are several groups who could have done the same work, but were too slow or chose a different direction. Scoops there are frequent and inevitable.

I guess I’d say thermostable DNA polymerase was “obvious” after it was discovered.
Another case is the famous M_bh-sigma relation, but let us not go there.

A lot of observations done by national facilities are “obvious”, but the data rights are (usually temporarily) given to the group that put together the best proposal for taking and analysing the data. A lot of Hubble discoveries are like that – there are targets several groups propose for but only one group may get the data, based on their observing strategy, and they have exclusive rights to it for a year.

Transiting planets are a “wish I had done that” discovery. A lot of people could have made the discovery, the equipment was and is widely available, with a bit of hustle and a lot of hard work chasing after targets. Those situations are where a scoop is likely and some delicacy is required to get the data and get out the results without someone cutting you off on the inside.

So, how do scoops happen?
Sometimes it is honestly a case of happenstance in timing.
One case I was involved in, we wrote a Science paper based on some Hubble data that was no longer proprietary. The author combination was serendipitous, literally a matter of who sat down together at a table for dinner, and we pulled together the associated data and got a paper out fast.
After the paper was accepted, but under embargo, I got an e-mail from a graduate student who had noticed some interesting x-ray data, and found there was matching Hubble data which showed something interesting – and his advisor thought he should ask me if he was onto something. And I had to tell him we already had a paper in press on this, and that we had come upon it from the other side as it were.
He was scooped, and we were almost scooped.
Any similarity to a famous Asimov “detective robots” short story is coincidental…

The nastiest scoops come from the competing teams with bleeding edge projects.
A genuine scoop, as distinct from merely coming second in a discovery race, comes when the competing team learns the other team is near publication, and they decide to cut a corner and go public more rapidly, possibly with less conclusive results, but still adequate, particularly after the other team “confirms” the discovery with later, more thorough data.
I won’t try to pick an example, those get too nasty and sometimes ambiguous to the outsider.

The temporarily proprietary data situation can also set up scoops.
Mostly those are a matter of the data owners not getting the results out and someone cherrypicking the data for a quick result when it goes public. This is usually a feature, it is meant to get the data owners motivated, but sometimes this leads to scoops. I was once on a paper using Hubble data, where a referee kept slowing things down and asking that we remove interpretation from the paper and focus on just the data analysis. On the nth round of this, a preprint appeared interpreting our data, just as they became public… that one we called in to the authorities and some improprieties were unveiled.

The unique facilities people are least likely to be scooped – places like LHC or LIGO.
Curiously they tend to be the most tightlipped and cautious on results – the fear there is the retraction – since they can not be scooped they must be certain of the result and not risk the embarrassment of the retraction. So, slow and careful.
This was not always the case: in the past there were occasions when facilities were duplicated, and this led to scoop situations such as the J/Psi case.

So what of the “anyone could do this, if only they knew” discoveries – those are easy to scoop. How do those scoops happen and can they be safeguarded through something less elegant but more straightforward than an anagram on a blog…

So, results leak before publication.
For one thing the referee, and editor, see the results. They mostly respect confidence but referees are human and do talk.
There are also proposals – to get funding, or observing time, the preliminary results and expected progress must be revealed, and the reviewers then know.
That is a potentially nasty situation and I have heard strong assertions of scoops both by referees and reviewers. Those are generally unprovable, especially for “obvious” discoveries, since the referees and reviewers are generally experts in the field and may genuinely already have been working on the problem.
More broadly, people tend to discuss results before publication at meetings, over beer and with trusted colleagues, so rumours spread.
Sometimes the mere rumour is enough to trigger a scoop, since the hint can be enough to put someone on track.

Now, here timelines become an issue: precedence generally goes by publication date.
Submission date is recorded, but rarely helps in the war over precedence.
But sending a preprint to arXiv before publication will easily overrule both submission and publication in perception of precedence.
Sometimes the “first” discoverers just get slowed down by the process: the referee is slow, maybe deliberately so to let allies push their competing result through faster; or maybe the other journal just does faster turnaround, some journals specialise in that.
And, sometimes, other groups are just faster or more ruthless – a lone scientist at a small college, or with a single grad student can easily be outraced by a large group at a well funded institution with a PI who can throw postdocs and multiple students at a problem, once it is known to be interesting. The competition can also, as I noted before, decide to try to publish with a less conclusive preliminary result, hoping it is adequate for precedence, particularly if the competition is prone to perfectionism.

So, scoops happen.
Mostly they suck.
Sometimes they don’t matter, “everyone knows” who really did it (actually in those cases it is usually the people outside the field who get the misperception of precedence). Sometimes it matters a lot and things become very bad within a subfield for a while.

Is there a process to short cut this – some sort of idea registry, or a depository of research progress (other than the legendary “lab book” with the date and time scribbled in #2 pencil at 3 am)?

I don’t know.

Idea registries would probably become like modern patents, a small number of people tossing off lots of ideas to claim precedence but without actually doing the hard work.

A “depository of progress” might work – maybe a trusted public web site with encrypted claims, and a one-time pad that was PGP encrypted or something.
So if a claim needed to be asserted a researcher could reveal a key phrase encoded via PGP that would decode a timestamped research progress claim.
Easy to set up, cheap to maintain, don’t know who would bother.
Though I can see some biogeneticspharmboys needing it.
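
For the curious, here is a minimal sketch of what one entry in such a depository might look like. It is not the PGP arrangement described above: it swaps in a plain salted SHA-256 hash commitment, which is one simple way to get the same "publish now, reveal later" behaviour. The function names commit and verify are made up for illustration, and the sketch assumes the public site (or some other trusted party) records when the digest was posted, since nothing in the code does the timestamping.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def commit(claim: str) -> Tuple[str, bytes]:
    """Return (digest, nonce) for a claim.

    The digest is what would be posted publicly; the nonce and the
    claim text stay private until the researcher wants to assert
    precedence. (Hypothetical helper, not part of any real registry.)
    """
    # A random 32-byte nonce keeps short or guessable claims from
    # being recovered by simply hashing likely candidate sentences.
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + claim.encode("utf-8")).hexdigest()
    return digest, nonce


def verify(digest: str, nonce: bytes, claim: str) -> bool:
    """Check that a revealed (nonce, claim) pair matches a posted digest."""
    expected = hashlib.sha256(nonce + claim.encode("utf-8")).hexdigest()
    # Constant-time comparison; not strictly needed here, but a cheap habit.
    return hmac.compare_digest(expected, digest)


if __name__ == "__main__":
    posted, secret_nonce = commit("Object A shows a periodic dip in brightness")
    print("post this publicly:", posted)
    # Later, to assert the claim, reveal both the nonce and the text.
    print(verify(posted, secret_nonce, "Object A shows a periodic dip in brightness"))
```

The digest gives nothing away about the claim, and changing even one character of the claim after the fact makes verification fail. What it cannot do is stop someone from posting many vague claims and revealing only the one that panned out, which is the same weakness as the idea registry.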

It is an imperfect system.

Comments

  1. #1 Steve
    August 20, 2008

    The anagram approach was used by Robert Hooke to establish precedence for his discovery of the law of elasticity. He published “ceiiinosssttuv” which unravels to give “Ut tensio, sic vis” meaning “As the extension, so the force”.

  2. #2 Gilad
    August 20, 2008

    Actually, the required elements exist with something slightly better than PGP and a one time pad – modern crypto has what is known as “commitment schemes” (see http://en.wikipedia.org/wiki/Commitment_scheme). If you’re really concerned about being scooped you can try and use one of those.

  3. #3 Ben
    August 20, 2008

    Penzias and Wilson didn’t exactly run away with their scoop. They had already discovered the noise before Dicke’s Princeton group had finished their telescope. The story usually goes that they talked to someone at MIT who told them about the CMB and to talk to Bob Dicke. Dicke visited them and confirmed that they had discovered the CMB. The Penzias and Wilson discovery paper (which has the most beautifully deadpan title in the history of astronomy) was published back-to-back with the Dicke et al theory paper.

  4. #4 Ben
    August 20, 2008

    Scooping is annoying. I think what’s most annoying (apart from actual unethical, as opposed to just ruthless, behavior) is that anyone cares. None of us can keep up with the pace of the literature now anyway, so except for the most world-shattering discoveries, what does it matter if Group A was on astro-ph or in press a month before Group B?

    On the other hand, there are people out there who, while not doing anything that is outright unethical, specialize in writing the firstest sloppy paper on a subject, stealing a bone from under some poor dog’s nose and leaving the careful plodders to clean up the resulting mess and get no glory from it. I don’t think there is any kind of pre-publication registry that would solve the problems, it would just multiply the amount of vaporware results. The only thing that would help is if the community valued careful work and downvalued the sloppy papers. Yes, we are supposed to do this, but in a world short of time, I don’t think it happens enough.

  5. #5 Shantanu
    August 20, 2008

    Also, Igor Novikov and collaborators had predicted (before CMB discovery) that the Bell labs antenna would be best suited for the discovery of CMB.

  6. #6 Steinn Sigurdsson
    August 21, 2008

    Penzias and Wilson was a scoop, because they found the CMB serendipitously before a team that was deliberately trying to measure it did, and the Dicke team would have measured it, given a bit more time.

    What Novikov paper discusses the CMB before Penzias and Wilson published?
    Is it the 1964 paper with Doroshkevich? I do not have a copy of that. I note Bharat cites it in his recent review.

    Ah, I see – Naselsky, Novikov and Novikov lead off on this in their book.
    I’d like to see the original paper…

  7. #7 Ben
    August 21, 2008

    I agree that Penzias and Wilson scooped the Princeton group; my point was they didn’t run away with it by talking to some third party and rushing it into the literature, catching the Princetoons unawares. That might happen now, but it was a different time then, I suppose.

  8. #8 Shantanu
    August 21, 2008

    I agree it was a scoop. I think Bob Dicke mentioned this to his group
    “We are scooped”. Also I don’t think the Bell Labs group were aware of the Novikov paper.
    Also which review paper by Bharat (Ratra ?) are u referring to?
    thanks

  9. #9 Steinn Sigurdsson
    August 21, 2008

    yeah. I’ve heard a first hand account that after he took the phone call he told Peebles and Wilkinson “boys, we got scooped”.
    Novikov agrees there is no way Penzias and Wilson would have read Sov Phys Dok, and Penzias and Wilson separately said they were unaware of it until much later.
    Dicke’s group had also not seen it.

    The paper by Bharat is Ratra & Vogeley PASP 120 235 (2008)
    Nice concise invited review.