Women of the Arxiv

Over at FiveThirtyEight, they have a number-crunching analysis of the number of papers (co)authored by women in the arxiv preprint server, including a breakdown of first-author and last-author papers by women, which are perhaps better indicators of prestige. The key time series graph is here:

Fraction of women authors on the arxiv preprint server over time, from FiveThirtyEight.

This shows a steady increase (save for a brief drop in the first couple of years, which probably ought to be discounted as the arxiv was just getting started) from a bit over 5% women in the early ’90s to a bit over 15% now. The more detailed discussion in the article is worth reading, and mostly stands on its own.

One thing, though, that I wish they had included was a reference to this graph from the American Institute of Physics showing basically the same trend:

Fraction of Ph.D.’s in physics awarded to women, as a function of time. From the AIP Statistical Research Center.

That shows the fraction of physics Ph.D.’s earned by women over the years, which rises from a bit over 10% in the early ’90s to around 20% now. The data on women in faculty positions is less complete, but shows a similar trend.

The FiveThirtyEight piece, by Emma Pierson, covers a lot of issues, but I wish they’d dealt a bit more with this change over time. Because in some ways, that tells you a lot about the underlying dynamics– if the number of papers featuring women as authors simply tracks the number of women in physics in general, that’s one thing. If it rises more slowly than you would expect from the number of women in physics, that would be saying something else, and much less positive. Absent that, it’s hard to know what to really think about the trend Pierson reports.
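As a back-of-the-envelope version of that comparison, here is a minimal sketch that uses only the rough endpoint values quoted above ("a bit over 5%" and "a bit over 15%" for arxiv authors, "a bit over 10%" and "around 20%" for Ph.D.’s). The numbers are eyeballed from the two graphs rather than taken from either data set, and the name `representation_ratio` is just an illustrative label, not anything from the FiveThirtyEight or AIP analyses.

```python
# Approximate endpoint values read off the two graphs discussed above.
# These are rounded eyeball estimates, not the underlying data.
arxiv_fraction = {"early 90s": 0.05, "now": 0.15}  # women among arxiv authors
phd_fraction = {"early 90s": 0.10, "now": 0.20}    # physics Ph.D.s earned by women


def representation_ratio(author_frac, pool_frac):
    """Author fraction relative to the Ph.D. fraction (hypothetical helper).

    1.0 would mean authorship matches the Ph.D. share exactly;
    below 1.0 means authorship lags it.
    """
    return author_frac / pool_frac


ratios = {
    period: representation_ratio(arxiv_fraction[period], phd_fraction[period])
    for period in arxiv_fraction
}

for period, ratio in ratios.items():
    print(f"{period}: arxiv fraction is about {ratio:.2f} of the Ph.D. fraction")
```

With these rough inputs the ratio goes from about one half to about three quarters. Taken at face value, that would mean authorship is gaining on the Ph.D. share rather than falling behind it, but the endpoints are too crude, and new Ph.D.’s too imperfect a proxy for the pool of working physicists, to read much into it.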

Of course, it’s a difficult matter to tease this out, and there’s also an issue of subfield distribution– the arxiv started out as exclusively high-energy theory, and has expanded over time to cover a lot more of physics and math, but it’s by no means complete– when I spot an interesting paper in AMO physics, there’s only about a 50% chance that I’ll be able to find an arxiv copy. That’s going to affect the pool from which they’re drawing, which affects what you would expect to see in terms of authorship.

But this kind of basic analysis is a good starting point, and it’s always nice to have more data in the discussion.


  1. #1 prosaica
    August 5, 2014

    I’m curious as to why they didn’t analyze mathematics separately. A lot of mathematics is on the Arxiv (definitely >50% in the subfields I’m familiar with) and, well, we put authors in alphabetical order :).

  2. #2 Chad Orzel
    August 5, 2014

    When she talks about the positional analysis of author lists, she mentions that math was excluded for exactly this reason. It’s a little buried, though.

  3. #3 Steinn Sigurdsson
    August 5, 2014

    Hm, the arXiv was seeded with a preprint database curated by a female scientist, I wonder if that is correlated with the spike in the first year, or if it is due to the change in subfields contributing to the arXiv papers in the early years – clearly a followup analysis is required…

  4. #4 Uncle Al
    August 5, 2014

    A scientist is defined by objective competence, not social agenda. Management is rewarded for quantitative enforcement of rules. Said reward is not contingent upon whether the rules are pertinent to the task, or make any sense at all. The Battle of the Somme was Napoleonic infantry tactics oozing through barbed wire uphill into German machine guns. How many generals were cashiered?


    YOU did it! YOU weren’t happy making babies and driving black cars with stick shifts. YOU wanted pastels and automatic transmissions. YOU wanted to vote, YOU wanted to be in the workplace, YOU wanted to be in the military. YOU wanted Equal Rights, Equal Opportunity, Affirmative Action, diversity, compassion, and daycare. You’ve gotten it lady, trumped in spades redoubled. Work your butt off, abandon any thoughts of family, and watch as real experts in whining being carried in sedan chairs beat you to the finish line every time. Nobody dares hold you dearly for fear of being drawn and quartered by “social activists” and their pro bono legal representation.

  5. #5 Eric Lund
    August 5, 2014

    High-energy experimental papers also frequently list their authors alphabetically, so they would have to be taken out of the analysis as well. On most if not all of the papers published by the ATLAS collaboration, the first author has the surname Aad. I’m pretty sure that he (or she–I only know this person’s first initial) did not write all of those papers.

    As for last authors, I know it’s common in biomedical fields to put the authority figure in that position, which has the benefit of making it easy to identify all papers coming from a particular group: look for author lists ending with I. M. Last, and you’ll get an overview of what Dr. Last’s group has been doing. But I’m not aware of any physics fields (other than possibly biophysics, which would reasonably follow the biomedical convention) that assign any particular significance to the last author of a paper with more than two authors–my own field certainly does not.

  6. #6 Chad Orzel
    August 5, 2014

    The “PI is last author” thing is pretty common in AMO physics. It’s true for all of the papers I’m on, save one, where the theorists upstairs just concatenated two group lists (they were analyzing data we took), and put us at the end, giving Steve Rolston the last author slot when it probably should’ve been Paul Julienne.

  7. #7 prosaica
    August 5, 2014

    She did include mathematics fields in the graph “How much more likely…”
    Still, it’s better than 538’s analysis of the World Championship :).

  8. #8 CCPhysicist
    August 7, 2014

    The data on women on the faculty from AIP is quite striking. The breakdown by rank is much less important than the breakdown by type of college. The bias against women at PhD schools is decreasing, but still quite noticeable.

    I used to collaborate with experimentalists, and the pattern on those papers was that the person who did all the experimental work (usually a grad student) was first author and the theorist who oversaw another part of the analysis was listed last. That was not, however, universal in our field or even in our own collaborations. Since this was mostly back when author lists were broken up based on institution, it could be the case that the first name of the last group played a key role.

    Very risky to take the last author in our lists seriously, but it seems plausible that you can read the first-author results from 538 as indicative of the increasing number of female graduate students or post-docs taking that lead role.

  9. #9 Mike
    August 11, 2014

    Along the same lines:

    If we assume the norms discussed for 1st and last author positions, the 1st author position will often be a graduate student or a post-doc, so the % of female first authors would be expected to be close to the % of females earning Ph.D.s in the years close to the paper’s publication.

    If we assume the same norms, the % of last authored papers would be expected to lag the % of females earning Ph.D.s and even the % of female scientists. That’s what was found. The question is really, does it lag more than would be expected or is it about what would be expected? That’s the question I really wish had been addressed.

  10. #10 Pi-Guy Dave
    August 12, 2014

    Uncle Al,

    Did you think to check the caliber of the conversation before deciding it would be useful to post your own personal angers and biases to the blog? If you did check, what makes you feel your post adds to the conversation?
    I must admit to being a little confused by your rather incoherent prose but could you let me know whether, as your post reads, you are genuinely blaming women for the “over-the-top” massacres during WWI?

    Also, are you implying that you believe women should: just be human manufacturing plants; have the vote and employment withheld; still be discriminated against in all walks of life but especially employment and academia (etc etc) on basis of the configuration of their 23rd chromosome?

    It is my experience from previous girlfriends and from my wife that women love to be “held dearly.” Have you considered that if a woman won’t let you hold her dearly it is because you give the impression of being an angry misogynist, and it is not because all women are some sort of sociopathic feminists? You really should.

  11. #11 Chad Orzel
    August 12, 2014

    I debated whether to clear Pi-Guy Dave’s comment from moderation and risk a pissing contest while I’m out of the country. It wasn’t offensive, though, just ill-advised, so I let it through. If this turns into an extended exchange of unpleasantries, though, I will delete all of the comments, and ban those involved when I get up in the morning (UK time).
