Yet another “academic” article about Wikipedia (Content Volatility of Scientific Topics in Wikipedia: A Cautionary Tale; Adam M. Wilson, Gene E. Likens):

Wikipedia has quickly become one of the most frequently accessed encyclopedic references, despite the ease with which content can be changed and the potential for ‘edit wars’ surrounding controversial topics. Little is known about how this potential for controversy affects the accuracy and stability of information on scientific topics, especially those with associated political controversy. Here we present an analysis of the Wikipedia edit histories for seven scientific articles and show that topics we consider politically but not scientifically “controversial” (such as evolution and global warming) experience more frequent edits with more words changed per day than pages we consider “noncontroversial” (such as the standard model in physics or heliocentrism). For example, over the period we analyzed, the global warming page was edited on average (geometric mean ±SD) 1.9±2.7 times resulting in 110.9±10.3 words changed per day, while the standard model in physics was only edited 0.2±1.4 times resulting in 9.4±5.0 words changed per day. The high rate of change observed in these pages makes it difficult for experts to monitor accuracy and contribute time-consuming corrections, to the possible detriment of scientific accuracy. As our society turns to Wikipedia as a primary source of scientific information, it is vital we read it critically and with the understanding that the content is dynamic and vulnerable to vandalism and other shenanigans.

The content of the paper is laughably thin. After a brief anecdotal introduction about blowjobs in 2011, we come to the brief substance of the paper:

We compared three topics we consider to be politically (though not scientifically) controversial (acid rain, global warming, and evolution) and four we consider to be politically uncontroversial (heliocentrism, general relativity, continental drift, and the standard model in physics)… We then calculated three metrics from each article’s history: 1) daily edit rate (excluding successive edits by the same user, n = 23,156), 2) mean edit size (the total number of words inserted, deleted, or changed) on days with at least one edit (n = 8,525), and 3) mean number of page ‘views’ per day (which includes requests by computer programs, only available after 2008-01-01)… The geometric means (±SD) ranged from 0.2±1.4 edits per day and 9.4±5.0 words changed per day for the standard model to 1.9±2.7 edits per day and 110.9±10.3 words changed per day for global warming… the three ‘controversial’ topics each had greater mean edit rates than each of the ‘noncontroversial’ topics (p<0.05)...

That’s about it, really.
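For concreteness, here is a minimal sketch of how the first metric (daily edit rate, excluding successive edits by the same user) might be computed. The paper's actual scripts aren't reproduced here, so the function names, the sample data, and the choice to collapse same-user runs across day boundaries are all my assumptions.

```python
from collections import Counter
from datetime import date


def collapse_successive(revisions):
    """Keep a revision only if the previous revision was by a different user.

    `revisions` is a list of (day, user) pairs in chronological order.
    Assumption: runs are collapsed even when they span midnight.
    """
    kept = []
    previous_user = None
    for day, user in revisions:
        if user != previous_user:
            kept.append((day, user))
        previous_user = user
    return kept


def edits_per_day(revisions):
    """Count the collapsed edits that fall on each calendar day."""
    return Counter(day for day, _ in collapse_successive(revisions))


# Made-up sample data, for illustration only.
sample = [
    (date(2015, 8, 13), "UserA"),
    (date(2015, 8, 13), "UserA"),  # successive edit by the same user: dropped
    (date(2015, 8, 14), "UserB"),
    (date(2015, 8, 14), "UserC"),
    (date(2015, 8, 14), "SomeBot"),
]
daily_counts = edits_per_day(sample)
```

With this sample, the 13th counts as one edit and the 14th as three. Note that this only counts edits; it says nothing about how long any of them takes to check.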

A better analysis

The above is largely useless, because the original assertion was “The high rate of change observed in these pages makes it difficult for experts to monitor accuracy and contribute time-consuming corrections”, although that really should have been phrased as a question. So, let’s look at a few days of global warming’s history. For 2015-08-14:

* BG19bot m . . (174,469 bytes) (-2)‎ . . (WP:CHECKWIKI error fix. Syntax fixes. Do general fixes if a problem exists. – using AWB (11377))
* Isambard Kingdom . . (174,471 bytes) (-138)‎ . . (Undid revision 675994710 by Jamalmunshi (talk) Self-serving link removed.)
* Jamalmunshi . . (174,609 bytes) (+138)‎ . . (→‎External links)

So, that’s easy to check: someone added a promo link and it got reverted; and a bot fixed something.

The previous day:

* Short Brigade Harvester Boris (talk | contribs)‎ . . (174,471 bytes) (+46)‎ . . (→‎Initial causes of temperature changes (external forcings): give Milankovitch his own section (albeit short), consistent with other external forcing)
* Short Brigade Harvester Boris (talk | contribs)‎ . . (174,425 bytes) (-63)‎ . . (→‎Initial causes of temperature changes (external forcings): tighten a little)

Those are both Boris, so you can trust them, and he’s even explained them. There’s nothing on the 12th, the 11th is Boris again (one of which is almost interesting, but it’s just pruning). Nothing on the 10th or the 9th, and on the 8th only more pruning by Boris. Nothing on the 7th, 6th or the 5th (are you starting to see a pattern here?). On the 4th there is vandalism which lasts 8 minutes.

We need to go back to the 25th of July for anything interesting; Jcardazzi (talk | contribs)‎ . . (176,821 bytes) (+1,884)‎ . . (→‎Warmest years: Jan-June 2015 records), which turns out to be the stuff Boris pruned later.

In short, an entire month’s worth of edits can be checked in less than five minutes. Indeed you can check the changes since the 6th of June (which happens to be at the bottom of the 50-changes default history page) with one diff.

The abstract ends with “As our society turns to Wikipedia as a primary source of scientific information, it is vital we read it critically”; I think they need to take their own words to heart.

Update: a note on the stats

As alluded to in the comments here, there are problems with the statistical analysis, brief as it is. For example:

The geometric means (±SD) ranged from 0.2±1.4 edits per day…

(when I read the article I just blipped over the stats as obviously uninteresting, so didn’t notice this). What can “0.2 +/- 1.4 edits per day” possibly mean? Well, as a textual form written as mean +/- SD, it can represent exactly that. But since edits per day is inevitably bounded below by zero, the form doesn’t mean very much. This looks to be symptomatic of someone running a bunch of numbers through a stats program without thinking. The paper is, nominally at least, peer reviewed and it’s rather disappointing this wasn’t picked up. PLOS ONE has very low standards, as I understand it: as long as the paper isn’t obvious drivel it can be published. This is at the low end of even that rather forgiving scale.
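To make the point concrete, here is a small sketch of what a geometric mean with a geometric (multiplicative) SD would look like. It assumes, and this is only an assumption, that the 1.4 quoted above is a geometric SD; the function names are mine, not the paper's. The geometric SD is the exponential of the standard deviation of the logged values, so the sensible range is "times or divided by", not "plus or minus".

```python
import math


def geometric_stats(values):
    """Return (geometric mean, geometric SD) of strictly positive values.

    The geometric SD is exp(sample SD of the logs), so it is a
    multiplicative factor, not an additive one.
    """
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    var_log = sum((x - mean_log) ** 2 for x in logs) / (len(logs) - 1)
    return math.exp(mean_log), math.exp(math.sqrt(var_log))


def multiplicative_range(gm, gsd):
    """One-geometric-SD range around the geometric mean: gm / gsd to gm * gsd."""
    return gm / gsd, gm * gsd


# Reading "0.2 +/- 1.4" as geometric mean 0.2 with geometric SD 1.4 (assumed).
low, high = multiplicative_range(0.2, 1.4)

# Sanity check on made-up data: logs are evenly spaced, so GM = GSD = 10.
gm, gsd = geometric_stats([1, 10, 100])
```

On that reading the range is roughly 0.14 to 0.28 edits per day, which at least stays above zero. Note also that the logarithm is undefined at zero, so days with no edits would have to be dropped or offset before any of this could be computed; what was actually done isn't clear from the text quoted here.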

Refs

* Not a ref, but #crypticclimateclue from Gavin: “For who’s sake? Griper role played by emeritus fellow (5,1,6,2)”
* Gizmodo has a crap article, too.
* User_talk:Jimbo Wales#PLOS ONE article about Wikipedia POV-pushing in the sciences

Comments

  1. #1 Boris
    2015/08/16

    Hit a nerve, did it?

  2. […] Source: Content Volatility of Scientific Topics in Wikipedia: Plowshare Prates (anag.) [Stoat] […]

  3. #3 David B. Benson
    2015/08/16

    Yawn.

  4. #4 Karl Lembke
    United States
    2015/08/16

    I’ve sometimes wished for an app that would delve through the edit history of a Wikipedia article and color code text based on how frequently it had been edited over time.
    White — unchanged in the last year or so. Blue, a small number of changes, up to red, more than one change per week on average.

  5. #5 John Mashey
    2015/08/16

    Besides WMC’s issues, there are others regarding statistics.

  6. #6 James Annan
    http://blueskiesresearch.co.uk/
    2015/08/16

    Most interesting thing about this is the 2nd author (assuming it’s the same guy).

    [Likens? Meant nothing to me. In the press releases he gets the noise. Emeritus. Now you say it, he shows up in “merchants of Doubt”, but as one of those who found Acid Rain. Wiki agrees https://en.wikipedia.org/wiki/Gene_Likens -W]

  7. #7 Marco
    2015/08/16

    Yes, that’s the same guy:
    http://www.caryinstitute.org/science-program/our-scientists/dr-gene-e-likens
    Cary is listed as one of his affiliations on the paper.

  8. #8 Raymond Arritt
    El Monte Legion Stadium
    2015/08/16

    OK, so I actually read the paper. It’s amazing that they got a peer reviewed publication out of something so utterly trivial with no attempt at in-depth analysis or insight.

    Download the data, run a couple of R scripts, and write it up. From start to finish it couldn’t have been more than an afternoon’s work.

  9. #9 Hank Roberts
    out toward some edge or other
    2015/08/16

    > Karl Lembke
    > … wished for an app that would delve through
    > the edit history of a Wikipedia article and
    > color code text based on how frequently it
    > had been edited over time.

    Good idea, especially if the back-and-forth-but-no-progress editing has a telling color. I suggest puce.

    Seems to me the warning in the paper is for the youngsters (and elderly, increasingly, I notice) who go there, read the front page text, and think it’s reliable. Do you think young adults nowadays are smart enough to assume they need to read multiple versions of the reference material? I haven’t seen that happening.

    Last time I pointed out a visible comet to a bunch of theoretically educated teenage people, the question from several was “is it going to hit us?”

    http://assets.amuniversal.com/71d1d6b009ca0133f3b1005056a9545d

  10. #10 Raymond Arritt
    in Mojave, in a Winnebago
    2015/08/16

    Now being discussed Chez Jimbo: https://en.wikipedia.org/wiki/User_talk:Jimbo_Wales#PLOS_ONE_article_about_Wikipedia_POV-pushing_in_the_sciences

    You may (or may not) want to link this discussion there.

  11. #11 outeast
    Czech Republic
    2015/08/16

    Reminds me a little of a discussion I had with a colleague a few weeks back… he was arguing in favour of hardcopy over online resources for just such reasons. He later pulled out a roughly 50-year-old encyclopedia (certainly Soviet-era) to back up an assertion on climate science…

  12. #12 John Mashey
    2015/08/16

    More on the stats problems.

    See the Figure in the paper

    If the distributions are right-skewed, and non-negative, they might be lognormal, or at least close, in which case a standard deviation is useful, but it’s a *multiplicative* SD, not an additive. (Of course, one would want to compute skew and kurtosis of the log distribution, maybe do a normality test, to assess whether it’s close enough to lognormal for the SD’s to be really helpful.)

    Wikipedia says:
    “The geometric standard deviation was proposed for the first time by Kirkwood (1979) as a measure of log-normal dispersion analogously to the geometric mean.[1] As the log-transform of a log-normal distribution results in a normal distribution, we see that the geometric standard deviation is the exponentiated value of the standard deviation of the log-transformed values, i.e. \sigma_g = \exp(\operatorname{stdev}(\ln(A))).

    As such, the geometric mean and the geometric standard deviation of a sample of data from a log-normally distributed population may be used to find the bounds of confidence intervals analogously to the way the arithmetic mean and standard deviation are used to bound confidence intervals for a normal distribution. See discussion in log-normal distribution for details.”

    a) It is hard to understand how 0.5 +/- 2.0 edits/day is a useful characterization, given that there are never negative edits.
    I might believe 0.5 */ 2.0, assuming 2.0 was really the Geometric SD.

    b) but then 36.2 +/- 10.2 and especially 142.3 +/- 22.9 doesn’t seem to use Geometric SD, as that would imply some very large edits, thousands of words, i.e., 142*22.9 = ~3300 words, with 17% bigger (if it were lognormal).

  13. #13 John Mashey
    2015/08/16

    Note: see log-normal distribution, skip to the “Occurrence” section. It is widely used in some areas of science and engineering.
    It is also useful in computer performance analysis, i.e., in explaining the SPEC CPU benchmarks.

  14. #14 Russell Seitz
    2015/08/16

    The Wiki could hire the pick of the Britannica editors it has un-employed, and set them to casting a gimlet eye on the fact-checking fray.

  15. […] William M. Connolley points out at ScienceBlogs, the PLOS ONE study does appear to have some shortcomings; he refers to the content […]

  16. #16 Fabian Flöck
    2015/08/18

    @Karl
    “I’ve sometimes wished for an app that would delve through the edit history of a Wikipedia article and color code text based on how frequently it had been edited over time.”

    We are actually working on that. Right now you can only show the author of the word, but we will soon implement that specific function as well:
    http://f-squared.org/whovisual/ –> whoCOLOR

  17. #17 Dave Mellert
    2015/08/18

    I think you are being unfair to PLoS ONE. The only difference between it and any other journal is the lack of requirement for scientific impact (sexiness, really). It is supposed to be equally as rigorous as any other journal with regard to strength of argument, sound methodology, statistical rigor, etc. You can find as many (or more) dodgy stats in Science or Nature.

    [I don’t think that’s true. Even leaving aside the dodginess of the stats, what’s really striking about this paper is its thinness. You wouldn’t get away with that in a “traditional” journal -W]

  18. #18 Hank Roberts
    lagging far behind, somewhere in the dust cloud
    2015/08/21

    http://catless.ncl.ac.uk/Risks/28.90.html#subj4
    and subsequent items are relevant
