Why do people worry about being scooped, how do scoops happen, how can there be rumours of discoveries and why isn’t the whole process made transparent?
So asked a commenter a few weeks ago, during the speculation about Greg Laughlin’s teasing anagram hint about a pending discovery or new result.
Why do academic scientists worry about being scooped?
Are there situations where it is not a concern? And if so, why the secrecy?
And why isn’t there a way to publicly record precedent without the games and rumouring?
The essence of the scoop is someone working on a result, and a competing group publishing essentially the same result before them.
In blind scoops the teams are unaware of each other; it gets nastier when one team is unaware of the competition, but the second team knows of the other team’s preliminary results or general research direction.
Good clean races occur when both sides know there is competition.
A lot of scientific discovery is not sensitive to scooping: it is incremental and collaborative work, with many people sharing credit over time. In a lot of other cases people might worry about being scooped, but as it happens no one was actually in a position to do so and they need not have worried.
There are many categories of discovery: some are the results of major long-term efforts using unique facilities; some are elegant pieces of bleeding-edge research using new techniques or insights, but which could be done by any of several research groups should they choose to focus on that path; some are “obvious”, after someone else thought of it; some are obvious, but hinge on proprietary data; and some are the “wish I had done that” – serendipitous discoveries that any number of people could have made if they had only been looking there.
A scoop is possible in all of these categories, and in each of them scoops have happened, or could have happened.
For example: Penzias and Wilson scooped Dicke’s cosmology group at Princeton in finding the cosmic microwave background. It was a discovery that required near-unique facilities: the Princeton group was doing a purpose-built experiment, while the AT&T group made the discovery serendipitously, and earlier. As I heard the story, a theorist who heard a presentation by the AT&T group told them about the cosmological speculation for a CMB and they ran away with it. A painful scoop.
A lot of stuff in genetics and biochemistry seems to be in the second category. Discoveries that require hard work by skilled teams and good equipment, but for any given breakthrough there are several groups who could have done the same work, but were too slow or chose a different direction. Scoops there are frequent and inevitable.
I guess I’d say thermostable DNA polymerase was “obvious” after it was discovered.
Another case is the famous M_bh-sigma relation, but let us not go there.
A lot of observations done by national facilities are “obvious”, but the data rights are (usually temporarily) given to the group that put together the best proposal for taking and analysing the data. A lot of Hubble discoveries are like that – there are targets several groups propose for but only one group may get the data, based on their observing strategy, and they have exclusive rights to it for a year.
Transiting planets are a “wish I had done that” discovery. A lot of people could have made the discovery, the equipment was and is widely available, with a bit of hustle and a lot of hard work chasing after targets. Those situations are where a scoop is likely and some delicacy is required to get the data and get out the results without someone cutting you off on the inside.
So, how do scoops happen?
Sometimes it is honestly a case of happenstance in timing.
In one case I was involved in, we wrote a Science paper based on some Hubble data that was no longer proprietary. The author combination was serendipitous, literally a matter of who sat down together at a table for dinner, and we pulled together the associated data and got a paper out fast.
After the paper was accepted, but under embargo, I got an e-mail from a graduate student who had noticed some interesting x-ray data, and found there was matching Hubble data which showed something interesting – and his advisor thought he should ask me if he was onto something. And I had to tell him we already had a paper in press on this, and that we had come upon it from the other side as it were.
He was scooped, and we were almost scooped.
Any similarity to a famous Asimov “detective robots” short story is coincidental…
The nastiest scoops come from the competing teams with bleeding edge projects.
A genuine scoop, as distinct from merely coming second in a discovery race, comes when the competing team learns the other team is near publication, and they decide to cut a corner and go public more rapidly, possibly with less conclusive results, but still adequate, particularly after the other team “confirms” the discovery with later, more thorough data.
I won’t try to pick an example; those get too nasty, and are sometimes ambiguous to the outsider.
The temporarily proprietary data situation can also set up scoops.
Mostly those are a matter of the data owners not getting the results out and someone cherry-picking the data for a quick result when it goes public. This is usually a feature: it is meant to get the data owners motivated, but sometimes it leads to scoops. I was once on a paper using Hubble data, where a referee kept slowing things down and asking that we remove interpretation from the paper and focus on just the data analysis. On the nth round of this, a preprint appeared interpreting our data, just as they became public… that one we called in to the authorities and some improprieties were unveiled.
The unique facilities people are least likely to be scooped – places like LHC or LIGO.
Curiously, they tend to be the most tight-lipped and cautious about results. The fear there is the retraction: since they cannot be scooped, they must be certain of the result and not risk the embarrassment of retracting it. So, slow and careful.
This was not always the case: in the past there were occasions when facilities were duplicated, and this led to scoop situations such as the J/Psi case.
So what of the “anyone could do this, if only they knew” discoveries – those are easy to scoop. How do those scoops happen, and can they be safeguarded against through something less elegant but more straightforward than an anagram on a blog?
So, results leak before publication.
For one thing the referee, and the editor, see the results. They mostly respect confidentiality, but referees are human and do talk.
There are also proposals – to get funding, or observing time, the preliminary results and expected progress must be revealed, and the reviewers then know.
That is a potentially nasty situation and I have heard strong assertions of scoops both by referees and reviewers. Those are generally unprovable, especially for “obvious” discoveries, since the referees and reviewers are generally experts in the field and may genuinely already have been working on the problem.
More broadly, people tend to discuss results before publication at meetings, over beer and with trusted colleagues, so rumours spread.
Sometimes the mere rumour is enough to trigger a scoop, since the hint can be enough to put someone on track.
Now, here timelines become an issue: precedence generally goes by publication date.
Submission date is recorded, but rarely helps in the war over precedence.
But sending a preprint to arXiv before publication will easily overrule both submission and publication in perception of precedence.
Sometimes the “first” discoverers just get slowed down by the process: the referee is slow, maybe deliberately so to let allies push their competing result through faster; or maybe the other journal just does faster turnaround, and some journals specialise in that.
And, sometimes, other groups are just faster or more ruthless – a lone scientist at a small college, or with a single grad student can easily be outraced by a large group at a well funded institution with a PI who can throw postdocs and multiple students at a problem, once it is known to be interesting. The competition can also, as I noted before, decide to try to publish with a less conclusive preliminary result, hoping it is adequate for precedence, particularly if the competition is prone to perfectionism.
So, scoops happen.
Mostly they suck.
Sometimes they don’t matter, “everyone knows” who really did it (actually in those cases it is usually the people outside the field who get the misperception of precedence). Sometimes it matters a lot and things become very bad within a subfield for a while.
Is there a process to short cut this – some sort of idea registry, or a depository of research progress (other than the legendary “lab book” with the date and time scribbled in #2 pencil at 3 am)?
I don’t know.
Idea registries would probably become like modern patents: a small number of people tossing off lots of ideas to claim precedence, but without actually doing the hard work.
A “depository of progress” might work – maybe a trusted public web site with encrypted claims, and a one time keypad that was PGP encrypted or something.
So if a claim needed to be asserted, a researcher could reveal a key phrase encoded via PGP that would decode a timestamped research progress claim.
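For the curious, here is a minimal sketch of how such a depository could work. Note the swap: instead of PGP, it uses a plain salted hash commitment, which is a simpler way to get the same "publish now, reveal later" effect. The function names, the in-memory `registry` and the example claim are all hypothetical, made up for illustration; a real service would also need the timestamps to come from somewhere everyone trusts.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timezone


def make_commitment(claim: str) -> tuple[str, str]:
    """Return (digest, nonce). Publish the digest; keep the claim and nonce private."""
    # The random nonce stops anyone from guessing a short claim by brute force.
    nonce = secrets.token_hex(16)
    digest = hashlib.sha256((nonce + claim).encode("utf-8")).hexdigest()
    return digest, nonce


def verify_commitment(claim: str, nonce: str, digest: str) -> bool:
    """Check that a revealed claim and nonce match a previously published digest."""
    expected = hashlib.sha256((nonce + claim).encode("utf-8")).hexdigest()
    return hmac.compare_digest(expected, digest)


# Hypothetical public registry: digest -> time the registry first saw it.
registry: dict[str, datetime] = {}


def register(digest: str) -> datetime:
    """Record a digest; a repeat registration keeps the original timestamp."""
    return registry.setdefault(digest, datetime.now(timezone.utc))


# Today: commit to a claim without revealing it.
claim = "Hypothetical claim: object X shows a periodic dip in brightness"
digest, nonce = make_commitment(claim)
stamped_at = register(digest)

# Later, if precedence is disputed: reveal claim and nonce, anyone can check.
assert verify_commitment(claim, nonce, digest)
assert not verify_commitment(claim + " (edited afterwards)", nonce, digest)
```

The registry only ever sees the digest, so nothing leaks until the researcher chooses to reveal the claim and the nonce, and the claim cannot be quietly edited after the fact without breaking the match.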
Easy to set up, cheap to maintain, don’t know who would bother.
Though I can see some biogeneticspharmboys needing it.
It is an imperfect system.