The Matthew effect in science

Douglas Kell: The Matthew effect in Science - citing the most cited:

The Matthew effect applies to journals and papers too - a highly cited journal or paper is likely to attract more citations (and mis-citations), probably for the simple psychological reasoning that 'if so many people cite it, it must be a reasonable paper to cite' (and such a paper is, by definition, more likely to appear in the reference list of another paper). Clearly that reasoning can be applied whether the paper has been read or otherwise. Simkin and Roychowdhury (2005 and 2007) note that a clear pointer to the citation of a paper one has not read is if it copies a mis-citation, and an analysis of the frequency of such serial mis-citations allows one to estimate, statistically, what fraction of cited articles have actually been read - at least at or near the time of writing a paper - by the citing author. Their analyses show (at least for certain physics papers) that "about 70-90% of scientific citations are copied from the lists of references used in other papers", and that a typical device is to start with a few recent ones plus their citations. Some aspects of this tendency in bibliometrics, especially with highly cited papers, can be detected from the power law form of the distribution of citation numbers, as in the Laws of Bradford and Lotka that I discussed before. Of course the mindless propagation of errors without checking sources properly is hardly confined to Science - a famous recent example with spoof data showed how some journalists simply copied Obituary material from Wikipedia!

I know people do this. Drives me crazy! Every paper I ever cited I read and re-read and re-read. Heck, I even tried to slog through papers in German (which I don't speak) if I thought they were relevant. But copy+paste just because others did? Nope.
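
For the curious, here is a rough sketch of how the misprint-counting idea in the quote could work. It is a simplified first-order estimate, not necessarily the exact model Simkin and Roychowdhury use. It assumes that whoever first introduces a given misprint was working from the original paper, and that every later repeat of the identical misprint was copied from someone else's reference list. The function name and the sample data are made up for illustration.

```python
from collections import Counter


def estimate_copied_fraction(misprinted_citations):
    """Estimate the fraction of misprinted citations that were copied.

    Simplifying assumption (ours): each distinct misprint was introduced
    once by someone working from the original paper, and every repeat of
    that same misprint was copied from another reference list.
    """
    counts = Counter(misprinted_citations)
    total = sum(counts.values())   # all citations that contain a misprint
    if total == 0:
        raise ValueError("need at least one misprinted citation")
    distinct = len(counts)         # number of different misprints seen
    read_fraction = distinct / total
    return 1.0 - read_fraction


# Hypothetical toy data: the erroneous volume/page strings found in citing papers.
sample = (
    ["vol 6, p 1811"] * 5
    + ["vol 6, p 118"] * 3
    + ["vol 9, p 1181", "vol 6, p 1182"]
)

print(f"Estimated copied fraction: {estimate_copied_fraction(sample):.2f}")
```

With this toy list there are 10 misprinted citations but only 4 distinct misprints, so the sketch attributes 6 of the 10 to copying. Real data would need care: two authors can make the same typo independently, and, as one might object, a copied reference string does not prove the paper went unread.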


Is it possible that some mis-citations arise more innocently? You would see the same effect if the researcher found the reference in another paper, copied it down (digitally or manually), read the paper properly, understood it, but then used the note of the reference they already had when writing their own text. Re-checking spelling or exact page numbers is not exactly exciting work!

And anyway, should the researcher necessarily read the entire paper? The reference indicates "this bit originally came from here, it's not my own", so it's an attribution, not an affidavit that the researcher has read each and every word of the paper.

Sam: If the original citation was wrong, how would the researcher have found the paper in the first place? I could see an erroneous paper title propagating in this fashion, but it is very hard to find a paper with an incorrect journal title, volume, or page number. (Not as hard as it was in the days before Google, but still...) That is one reason why getting the citation right is important.

By Eric Lund (not verified) on 03 Mar 2009

If I want to look up an astronomy or particle physics paper, I'm just gonna plug the first 1 or 2 authors, and maybe the year, into ADS or SPIRES and download a copy, along with the machine-generated BibTeX entry that they provide. That is, unless there's already an entry in the .bib file shared by my collaboration, potentially hundreds of people. That file was probably formed mostly by combining files from previous collaborations, and will be passed on to others in the future. Any errors will propagate, maybe for decades.
In fact, I just found a paper in SPIRES with my name mangled.

By Warren Focke (not verified) on 03 Mar 2009