This is a pushmi-pullyu post. I need some help with an environmental scan, so I'll get us started and the rest of you smart folks can amplify my knowledge.
I want to understand what's going on where with data curation specifically at the institutional level (no NOAA, no ICPSR, none of that) Stateside. Grant-funded is fine, though I'm doubly curious about programs that have been weaned (or are weaning themselves) off the grant money. Here are the programs I know about offhand:
- Institutional data curation: San Diego Supercomputer Center (right? I'm not entirely sure what they offer vis-a-vis long-term data stewardship), Purdue's D2C2, Cornell's DataStaR.
- Subject-specific but still (mostly?) institution-focused: Cornell's CUGIR (there must be a lot more GIS out there, mustn't there?), North Carolina's DRYAD.
- Data-curation training: Illinois, North Carolina.
Tell me what I'm missing, please and thank you.
I'm not sure if this is what you are looking for but we built an open software product called ir+ which allows researchers, faculty and staff to author, share, collaborate and optionally publish into the repository - all rolled into one application. We are running it in production here:
https://urresearch.rochester.edu
That's close... but it's mostly for collaboration on publications, yes? I'm hoping for things a little closer to raw data. (Which isn't to say IR+ couldn't go in that direction! Very likely it could.)
That said, I completely forgot to mention UPEI's virtual research environments, as well as HubZero and Scratchpads.
Well, actually it's a private workspace for sharing of any type of file or data with no restriction on usage - so it can be used for either - it's up to the researcher.
Size does become an issue on upload. On campus, we have had examples of uploaded files larger than 200 MB. However, I'm not sure what raw data set size you are looking into. Gigabyte-size data sets may be another story.
Right! That's right up my alley, then. Thanks! (And for small science, I think you've come pretty close to the 80/20 point. I suspect they'll let you know if/when they need more!)
Also noting this report, which answers my own question about San Diego.
The University of Idaho has INSIDE Idaho, the statewide clearinghouse for GIS data.
http://insideidaho.org/
You'd surely have to include UC's CDL... their digital preservation initiative has even been re-branded University of California Curation Center (UC3).
And surely Johns Hopkins would be worth a mention; their Datanet-funded activity stems from local roots.
Several of the NDIIPP projects have strong local roots as well. Georgia Tech would be a good example, I think.
Excellent. Many thanks! (Now, I will wander off and try to find out why my emailed comment notifications aren't working properly...)