This is the question I was asking myself while reading this fairly straightforward paper on open access in high-energy physics (hat tip to Garret McMahon).
It’s impossible to be in my particular professional specialty and not know about the trajectory of self-archiving in high-energy physics, but I learned a smallish detail from that paper that intrigues me rather: the existence of SPIRES, a disciplinary search tool that covers both the published literature and gray literature such as preprints on arXiv.
This strikes me as a rare thing. We have disciplinary gray-lit search tools such as RePEc in economics, and we have no end of disciplinary published-lit search tools (despite the considerable expense of securing access to them), but tools that do both? Within a given discipline? I’m not a reference librarian, so discipline-specific search tools aren’t my specialty at all, but I can’t think of anything else on the SPIRES model. There’s WorldCat and Google Scholar, of course, but neither of them is discipline-specific. EBSCO is known to index some library blogs for its library-science databases, but it doesn’t touch DList or E-LIS as far as I’m aware. Law might have some interesting things going on, given the novel importance of blawgs, but I don’t know of anything firsthand.
SPIRES makes me wonder, it really does. Imagine you’re a high-energy physicist (take that in either sense or both!). You search SPIRES; you know all your colleagues do, too. You have two ways to get your work into SPIRES so that it’s in front of their eyes: pop a preprint on arXiv, or go through the slow process of peer-reviewed publishing, a process that you don’t believe will change your paper much.
This is not the narrative that one typically sees regarding high-energy physics and self-archiving. It’s usually seen as a continuation of a print-culture norm of circulating preprints individually by mail. Still, I wonder.
What is the relevance of this little idyll to research data? This: If data are not indexed where researchers expect to search for disciplinary materials useful to them, will data be used? Taken seriously? Cleaned up and placed online in the first place, even? “Discoverability” of data, in the broad sense of “availability to web search,” may not be enough. Discoverability through discipline-appropriate channels, alongside other trusted materials, may well be the key.
Or so it seems to me.