Data longa, tractatus brevis

By dsalo on April 4, 2010.

Dan Cohen has an extraordinarily worthwhile post recounting his talk at the Shape of Things to Come conference at Virginia (which I kept my eye on via Twitter; it looked like a good 'un).

I see no point in rehashing his post; Dan knows whereof he speaks and expresses himself with a lucidity I can't match. I did want to pick up on one piece toward the end, because it has implications for library and archival systems design:

"Christine Madsen has made this weekend the important point that the separation of interface and data makes sustainability models easier to imagine (and suggests a new role for libraries). If art is long and life is short, data is longish and user interfaces are fleeting. Just look at how many digital humanities projects that rely on Flash are about to become useless on millions of iPads."

As I've had occasion to mention, scholars generally and humanists in particular have a terrible habit of chasing the shiny. If Dan's post helps lead to an ethic of "sustainable first, shiny later," I will be a very, very happy camper. (I note that Dan's shop has firsthand experience with losing older projects to the shiny: non-standardized JavaScript, if I recall correctly. Dan speaks from a position of hard-earned wisdom!)

The answer to this conundrum is not, however, "avoid the shiny at all costs!" It can't be. That will only turn scholars away from archiving and archivists. To my mind, this means that our systems have to take in the data and make it as easy as possible for scholars to build shiny on top of it. When the shiny tarnishes, as it inevitably will, the data will still be there, for someone else to build something perhaps even shinier.

Mark me well, incidentally: it is unreasonable and unsustainable to expect data archivists to build a whole lot of project-specific shiny stuff. You don't want your data archivists spending their precious development cycles doing that! You want your archivists bothering about machine replacement cycles, geographically dispersed backups, standards, metadata, access rights, file formats, auditing and repair, and all that good work.

So this implies a fairly sharp separation between the data-management applications under the control of the data archivists, and the shiny userspace applications under the control of the scholars. How many of our systems have, or even imply, such separation?

DSpace doesn't, to my everlasting annoyance. (Try building a userspace application on top of materials in DSpace but wholly outside it. Just try.) Omeka doesn't (sorry, Dan). Not Greenstone, not EPrints, not CONTENTdm, not any of the EAD systems out there, not DLXS. All of these are built as silos, with APIs that range from somewhat limited to appallingly limited. I'm here to say, the data silo needs to die, and the sooner the better.

Fedora Commons has this right. I say again: for all its faults, and it has them, Fedora Commons has this piece right. I also like what I see coming out of places like the Library of Congress, the California Digital Library, and the University of North Texas.

But let's stick with Fedora, because it's what I know best. Fedora isn't even trying to be the whole silo; it punts on the userspace problem entirely. It doesn't have a web user interface that anyone other than a command-line addict would recognize. What it has is a reasonably comprehensive (and improving) API on which any number of interfaces can be built.

Since "any number" is the exact number of interfaces that will need to be built (and coexist) over wildly varying data… you see why I think this the right approach. If you want to see this approach in action, you need seek no further than Islandora and its Virtual Research Environments.
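For the code-minded, the separation I'm arguing for can be sketched in a few lines of Python. To be clear, this is not Fedora's API and not Islandora's code; every name in it (Record, ArchiveStore, InMemoryArchive, and the two view functions) is made up for illustration. The only point is that the interfaces touch the data solely through a small read API, so any one of them can tarnish and be thrown away without the data noticing.

```python
from dataclasses import dataclass
from html import escape
from typing import Dict, Iterable, Protocol


@dataclass(frozen=True)
class Record:
    """One archived item (hypothetical shape): id, metadata, raw content."""
    identifier: str
    metadata: Dict[str, str]
    content: bytes


class ArchiveStore(Protocol):
    """Hypothetical read API: the only thing an interface may depend on."""

    def list_ids(self) -> Iterable[str]: ...

    def get(self, identifier: str) -> Record: ...


class InMemoryArchive:
    """Stand-in for the archivists' side (storage, backups, audits)."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def ingest(self, record: Record) -> None:
        self._records[record.identifier] = record

    def list_ids(self) -> Iterable[str]:
        return sorted(self._records)

    def get(self, identifier: str) -> Record:
        return self._records[identifier]


def title_list_view(store: ArchiveStore) -> str:
    """One userspace interface: a plain-text list of titles."""
    return "\n".join(
        store.get(i).metadata.get("title", "(untitled)")
        for i in store.list_ids()
    )


def html_gallery_view(store: ArchiveStore) -> str:
    """A second, shinier interface over the very same records."""
    items = "".join(
        "<li>{}</li>".format(
            escape(store.get(i).metadata.get("title", "(untitled)"))
        )
        for i in store.list_ids()
    )
    return "<ul>{}</ul>".format(items)


# Usage: two interfaces, one archive, neither interface owns the data.
archive = InMemoryArchive()
archive.ingest(Record("obj:1", {"title": "Letters & Diaries"}, b"..."))
archive.ingest(Record("obj:2", {"title": "Field Notes"}, b"..."))
print(title_list_view(archive))
print(html_gallery_view(archive))
```

Deleting html_gallery_view tomorrow costs the archive nothing, and a third interface can be written by someone who has never seen the first two. That is the whole design choice.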

Here's the fun bit: it doesn't take the University of Prince Edward Island's developers to create a new VRE. Any Drupal dev willing to learn about Fedora's view of the universe and reverse-engineer some of UPEI's code can do it. That's a fair few devs.

And that's the way the world will have to be. Data longa, tractatus brevis.
