The arXiv is a game changer for how large portions of physics (and increasingly other fields) are done. Paul Ginsparg won a MacArthur award for his vision and stewardship of the arXiv (something other institutions might want to note when they decide that someone trying to change how science is done isn’t really doing work that will impact them). So… Given: the arXiv is great. But there is something that’s always bothered me a bit about the arXiv: transparency.
(Note: for those of you who wish to complain that you can’t get endorsed on the arXiv, this article is not for you. Here is a place where that discussion will probably flourish.)
Now probably I’m sticking my foot where it is most likely to just get my toes broken, because I must admit, I really don’t know how the arXiv is run! I have, at least, looked a little bit, but really I’ve found out very little. I do know that there is an advisory board (including a few “quantum types” and a guy who likes to misspell “qubits”). And apparently there are advisory boards for the different major meta-categories. But why these people are on these boards, and what they actually do, is completely opaque to me. What does the arXiv advisory board do? And why are these people the advisers? From my vantage point the arXiv appears to be a strict oligarchy. Of course this might be the way things should be run, but I find it a bit jarring that a shining example of open access is itself, apparently, closed.
Running the arXiv must be a hard task, and I have nothing but words of high praise for the staff who must be behind the scenes keeping the gears running. But today, for example, the arXiv was unavailable for more than a few hours. Do any of us know what happened? No. (Update: here is news. Foot meet mouth.) Is it likely that we’ll ever find out? I wonder. Of course you can ask: why does this matter? In the case of today’s outage, it probably doesn’t matter much. But do any of us really know, for example, that proper redundancy has been established to keep the arXiv dataset safe from the problems that invariably crop up with that much data? Okay, some of us must know, because some of us are among the oligarchs.
A further point along these lines: increasingly the information that is in the arXiv is being used in ways that go beyond just the archiving of preprints. As we move to a science where online tools are more important, where open access is a legislated requirement, and where “the data” in computer-readable form is a major component of how science progresses, it seems that we, as scientists, should have some say in how the arXiv adapts to these coming changes. Reliability is an issue for those of us who have tried to use the arXiv in interesting or crazy ways. Of course it may be that the way things are working now is fine (I have few complaints), but can we be sure that this will continue?
As you can see, this post is simply full of questions. That’s because I really don’t know the answers to these questions. But I do think this is a discussion that hasn’t really been had, at least not that I know about. (Of course, “foot meet mouth” sounds a lot like “quantum pontiff” in dufuseaze.) So: is the arXiv too opaque, or just the right shade of transparency, like, you know, the kind you see through a nice cold glass of beer?