Tidbits

There is, in fact, more to life than the California vs. NPG battle royale. I know, I'm surprised too. It's funny because it's true! Daily Life in an Ivory Basement offers the NSF a data-management plan. Along those same lines, coping with data ranks high in worry factor in this OCLC report on research-related info needs faculty say they have. Rings true, though I don't entirely believe that faculty don't look to the library on copyright; what I believe is that they mostly don't think about it, but on the rare occasion that they do, they look to the library. See also Local scientist learns…
Did you miss the tidbits? I rather did. Data in climate science, and the problem of standardslessness: One database to rule them all, track global temperatures. Congratulations to Duke, the latest open-access mandate success! Paolo Mangiafico, on Open Access at Duke University. Not all governments are on the open-data bandwagon: When public records are less than public. See also NARA’s Digital Partnership Agreements, featuring the extreme difficulty of paying for large digitization programs without restricting access, at least initially. Which datasets merit preservation? Bryan Lawrence offers…
Tuesday seems a good day for tidbits. (I am head-down in my UKSG presentation and class stuff at the moment, so kindly forgive posting slowness.) One argument I rarely see made for open access that should perhaps be made more often is that it reduces friction in both accessing and providing information. Want to reduce the overhead of responding to FOIA requests? Post the information online. Data, data, we love data! Data is at the heart of new science ecosystem and Preserving the Data Harvest. Oh, and if you hadn't noticed, The Data Singularity is Here. Some good lay-level explanations of…
I'm in Urbana-Champaign this weekend to teach an in-person day for my online collection-development class. I'm looking forward to it; every time I teach I am reminded that students are smarter than I am. For now, tidbits! As world plus dog probably knows already, The Economist tackled the data deluge. Adam Christensen gives us the modest, unassuming Data. The foundation for everything on an intelligent, interconnected, instrumented planet. Rethinking scholarly communication from the ground up: SciBling John Dupuis asks Are computing journals too slow? and Dan Cohen muses about how best to…
It's Friday! Snack on some tidbits. In the "didn't anyone teach you to show your work in grade school?" department, we have NIWA unable to justify official temperature record, as well as the radical notion of using actual data to gauge the effectiveness of review boards in stopping unethical research. In the "open is not a panacea" department, we have Nat Torkington rethinking open data, or at least its funding models (hat tip to Trevor Muñoz), and JISC's Clarion project trying to convince principal investigators that sharing data is a useful thing to do. In the "let's kill all the lawyers"…
I'm home sick today, and not precisely looking forward to giving my class tonight because I really do feel wiped out. Fortunately, tidbits posts are easy… Denmark ponders the future of the research library. A thoughtful read for librarians; a good skim for scientists wondering how libraries will help them in future. Congratulations to Galaxy Zoo for its first published paper based on crowdsourced galaxy-classification data. May there be many more! Code is data too, says Chris Wiggins, arguing that you can't really judge results until you know what's been done to the data. An Economic Argument…
Happy Groundhog Day Eve! Or something. Jennifer Rohn discusses how suboptimal data management makes downstream tasks such as submitting papers to journals a bit harder. The bit about proprietary image formats is particularly cringe-inducing. Why Cameron Neylon is disappointed with Nature Communications. Nature is a leader among journals; the rest of us need it to get open access right. And speaking of Cameron, Hope Leman does an absolutely brilliant interview with him. The Research Information Network's new research officer asks pertinent questions about data-quality standards in UK data…
Because I scanted you on tidbits for quite some time, have a second tidbits post in a single week! A little library advocacy: Five library resources you should be using. Otherwise-closed data tend to open up in direct proportion to the perceived importance of the problem: GlaxoSmithKline opens up data on anti-malaria compounds. Now let's make this the default stance, shall we? Undergraduate science librarian Bonnie Swoger talks Science Online 2010 and data. Also on the Science Online 2010 roundup, the amazing Kevin Smith of Duke makes trenchant observations about copyright anxiety and it's…
I'm a bit late with these! Sorry about that. Bit busy around me just now. Data-sharing resolutions/requirements announced recently include: the American Naturalist and allied journals (possibly behind paywall, sorry), and the Linguistic Society of America. The calls for open data and data archiving redouble: from mainstream media such as New Scientist, from science bloggers like those at Bench Press, from service providers like Data Dryad. I try to stay out of the futurism game (sometimes unsuccessfully), but here are some eScience predictions for you from others. Conference reports relevant…
Wishing all of us a happy, prosperous, data-filled 2010. Unfortunately behind paywall: Nature says (rightly) that it's not quite as simple as "throw the data out there." Combining datasets carelessly may magnify faults in the original, eliminate crucial explanatory variables, or otherwise make a big hash of things. In which economics and computer science walk hand-in-hand. This isn't precisely data-driven science, but it's in the same neighborhood. "The biomedical sciences have moved on in the past quarter-century." Methods change; so do raw materials and communicative techniques. Money quote…
Every time I do a tidbits post, I think to myself, "gosh, that was a lot of tidbits; I'll never fill up the queue again." Every time, I'm wrong. The climate-data scandal staggers on: Gavin Baker has another great summary post, from which I particularly appreciated the Climategate article. We also have a climate skeptic who won't show his work. For a list of freely-available sources of climate data, see RealClimate or the Comprehensive Knowledge Archive Network. High-profile flu data leads to bizarrely childish behavior. One hopes that data-sharing norms will eventually put a stop to resource-…
I'm at home today owing to last night's epic snowfall in Madison shutting down practically the entire university, so it's time for tidbits! The biggest data story of the week is the climate-data hijacking. Gavin Baker has the best roundup I've seen. I also strongly recommend Cameron Neylon's thought-provoking response. The Digital Curation Blog has a lengthy series of roundup posts on the just-past International Digital Curation Conference. Next year in Chicago! I will be there with bells on. Climate change for libraries. No, nothing to do with the climate data scandal; instead, a cogent…
The tidbits folder is out of control, so this linklist may be a bit epic. My apologies! There's a lot of great discussion in this area of late. Data repositories: the next new wave. Steve Hitchcock is sensible, as usual. The answer to "are repositories changing?" is "they already changed," if one asks Carole Palmer. What's lagging, still, is institutional recognition and approval of those changes. See also ERIS's initial thoughts about repositories for researchers. Free the humanities data! says Adam Crymble. Ainsworth and Meredith describe e-Science for Medievalists, but do take a look even…
Have some Friday tidbits! An important biology dataset is losing NSF funding and may fold. Nor (as the article explains) is it the only one. It is impossible to overstate the desperate gravity of the data-sustainability question. Academic libraries, if we are not the white knights here—and we certainly have been in the past; witness arXiv—who is? On a similar theme, Yahoo pulls the plug on GeoCities. O ye researchers relying on consumer-grade web services, or new startups, have an exit strategy! Consumer-grade services die when they lose money. Jason Scott may not come charging to your rescue…
By way of amplifying the signal: the 5th International Digital Curation Conference is coming up in London in December. I will be there in spirit only, I fear, but I hope there will be a Twitter hashtag I can follow? Chris Rusbridge has blogged the program. (If I seem more scatterbrained than usual, it's because most of my spare time and brainspace is currently devoted to building a course I will be teaching online in the spring for Illinois's GSLIS. It's a "Topics in Collection Development" course, which means I have to view things through a lens I'm almost completely unfamiliar with—I don't…
Starting off the week with some juicy tidbits: An extremely nerdy but (for nerds) fascinating examination of XML and its implications for data modeling. Do we have to reduce everything to a relational model? Really? Perhaps not… Notably, it seems to me, this article describes fairly nicely how Fedora works. (For more beating on the humble RDBMS, see this blog post.) White Dielectric Substance in Library Metadata. "Understanding the noise turned out to be more important than understanding the signal." What does that mean for efforts to decide which data to preserve? "I've observed that most…
My del.icio.us tag overfloweth… A challenge to libraries from an information science professor: "I wish I could say that libraries were the obvious organization to take care of data… But… they have not been ambitious, they lack the subject area knowledge, they often lack the technical skills." What say ye, librarian Trogoolies? Cross-disciplinary use of data shines in this account of the decline of the Maya. "Space technology is revolutionizing archeology." Who would have guessed it? On the tools front, take a look at the Tranche Project, aimed at securely sharing datasets among researchers.…
The Book of Trogool turns another page... Social scientists and medical researchers, pay attention to this: "Anonymized" data really isn't—and here's why not. If informaticists aren't starting to run similar analyses on their own "anonymized" data, they should be. This is a serious concern. One for the humanists: the rather vaguely-named Scholarly Communication Institute Report from Virginia. The theme was using spatial data in the humanities. From my SciBling Christina: Anybody can code… but should you? Peer review is for more than published papers. Holding your code close to your chest…
Happy Labor Day, US readers. Time to clean out the "toblog" tag on del.icio.us again: Everyone else has already linked to this Wall Street Journal article on data curation, so who am I to go against the tide? My chief takeaway is the trenchant observation that judging the value of data is not straightforward. One scientist's noise is another's signal, and everything is grist for the history-of-science mill. My friend from ebook days Gene Golovchinsky is learning by experience some hard truths about migration versus emulation. Welcome to the fold, Gene! Let's all play with supercomputers! Fine…
Hello, Monday. My tidbits folder overfloweth. Want to text-mine JSTOR? Looks like you can. Garret McMahon talks about FriendFeed, scholarly communication, and embedded librarianship. Part of the reason I'm here is that I believe, with Garret, that we librarians can't kvetch about gettin' no respect if we don't put ourselves out there in the general research scrum. It's as true locally as globally. Jason Hoyt tells scientists to step up and own their part in the dysfunctionality of scientific communication. Three cheers from this librarian! Indulge me in further Cliff Lynch adulation: check…