One thing that kind of bugs me is that people answer the question "what impact has your funding had" with things like "I hired 3 postdocs and 2 support staff." Dr Lane talked about this at the workshop, but to some extent, I don't think her solution actually got at the bigger problem: societal impact. How has your research - done with our money - made the world a better place? (Maybe it hasn't, but that's ok, too.) In the last post I mentioned a way I think we could start to learn more about how much scientific articles are taken up in the general media. That uptake is at least an opportunity for public engagement, if not actually making the world a better place.
Another thing from the workshop that I and others keep coming back to is the strikingly different behavior of the Google user vs. the astronomer user of ADS. Kurtz mentioned that Springer has something like 90% (I originally wrote 60%) of their hits from Google. I suspect IEEE and maybe ScienceDirect are about at that level, too. So I'd like to see (or be pointed at, if it already exists) a study that clusters and names behavior types of users of these Google-able science digital libraries.
- How much traffic comes from Google? Of that traffic, what % are from recognized IPs; that is, those institutions that subscribe to this platform or have at least registered with the platform?
- Based on the activities, actions, clicks, time, and so on, can the users be clustered? Can these clusters be named? Of these named clusters, can we identify K-12 students? K-12 teachers? Undergrads? Scientists outside of the specialty targeted by the system (like physicists visiting ADS, or astronomers using SPIRES)?
- Can these clusters, along with their frequency of occurrence and their behaviors, be used to describe or better understand the impact of this system, and of the scientific knowledge held by it, on the broader public?
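To make the first question a bit more concrete, here is a rough sketch of how the two percentages might be computed from a platform's access log. Everything in it is an assumption for illustration: the log is taken to be a list of (client IP, referrer URL) pairs, the subscriber networks are made-up documentation ranges rather than real institutions, and the Google check is deliberately crude (it just looks for a "google" label in the referrer's host name). It does not attempt the clustering in the second question.

```python
from ipaddress import ip_address, ip_network
from urllib.parse import urlparse

# Hypothetical subscriber networks (reserved documentation ranges, not real institutions).
SUBSCRIBER_NETWORKS = [ip_network("192.0.2.0/24"), ip_network("198.51.100.0/24")]


def is_google_referrer(referrer):
    """Crude check: does the referrer's host name contain a 'google' label?"""
    host = (urlparse(referrer).hostname or "").lower()
    return "google" in host.split(".")


def is_recognized(ip):
    """True if the client IP falls inside one of the subscriber networks."""
    addr = ip_address(ip)
    return any(addr in net for net in SUBSCRIBER_NETWORKS)


def google_traffic_summary(hits):
    """hits: iterable of (client_ip, referrer_url) pairs from an access log."""
    total = google = google_recognized = 0
    for ip, referrer in hits:
        total += 1
        if is_google_referrer(referrer):
            google += 1
            if is_recognized(ip):
                google_recognized += 1
    return {
        # Share of all hits that arrived from Google.
        "google_share": google / total if total else 0.0,
        # Of the Google hits, the share that came from recognized IPs.
        "recognized_share_of_google": google_recognized / google if google else 0.0,
    }


# Made-up sample log, for illustration only.
sample_hits = [
    ("192.0.2.10", "https://www.google.com/search?q=x"),
    ("203.0.113.5", "https://scholar.google.com/"),
    ("203.0.113.6", "https://www.bing.com/"),
    ("198.51.100.7", ""),
]
summary = google_traffic_summary(sample_hits)
```

A real study would need a more careful referrer match and the platform's actual subscriber IP registry, but the two ratios are the ones the first bullet asks about.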
Update: Michael Kurtz corrected me that 90% of Springer's traffic comes from Google. He also suggests some places to look for studies of this type. (thanks!)
Bollen and my ARIST article discusses the type of user issue somewhat, and gives some references. The UCL group (Nicholas, Huntington, ...) have done quite a lot of work in this area; Tenopir and King have also written about this.
The Springer Google fraction is over 90%; one of the publishers yesterday said that was true for his stuff also.
One of Johan's papers looks at the Cal State Univ logs and identifies clusters of different types of students and researchers.
What I do when I end up at a Springer link from Google is plug the title of the article back into Google to see if I can find a version that isn't locked up. If I can't, I give up.
(This is for computer science.)
It would be quite interesting to see how this changes as more institutions adopt pre-indexed discovery layer software (Summon, Primo Central, EBSCO Discovery Service, etc.). The intention is that these tools will make the library portal more relevant, so that students and researchers go there to search first, rather than to Google.
Yeah, I'm curious about that, too. The pre-indexed discovery layer is theoretically awesome, but I guess time will tell if it really changes the game in practice.