U.S. Library of Congress to archive Twitter

From Twitter, here's the announcement:

Have you ever sent out a "tweet" on the popular Twitter social media service? Congratulations: Your 140 characters or less will now be housed in the Library of Congress.

That's right. Every public tweet, ever, since Twitter's inception in March 2006, will be archived digitally at the Library of Congress. That's a LOT of tweets, by the way: Twitter processes more than 50 million tweets every day, with the total numbering in the billions.

We thought it fitting to give the initial heads-up to the Twitter community itself via our own feed @librarycongress. (By the way, out of sheer coincidence, the announcement comes on the same day our own number of feed-followers has surpassed 50,000. I love serendipity!)

We will also be putting out a press release later with even more details and quotes. Expect to see an emphasis on the scholarly and research implications of the acquisition. I'm no Ph.D., but it boggles my mind to think what we might be able to learn about ourselves and the world around us from this wealth of data. And I'm certain we'll learn things that none of us now can possibly conceive.

Just a few examples of important tweets in the past few years include the first-ever tweet from Twitter co-founder Jack Dorsey (http://twitter.com/jack/status/20), President Obama's tweet about winning the 2008 election (http://twitter.com/barackobama/status/992176676), and a set of two tweets from a photojournalist who was arrested in Egypt and then freed because of a series of events set into motion by his use of Twitter (http://twitter.com/jamesbuck/status/786571964 and http://twitter.com/jamesbuck/status/787167620).

Twitter plans to make its own announcement today on its blog from "Chirp," the Official Twitter Developer Conference, in San Francisco.

So if you think the Library of Congress is "just books," think of this: The Library has been collecting materials from the web since it began harvesting congressional and presidential campaign websites in 2000. Today we hold more than 167 terabytes of web-based information, including legal blogs, websites of candidates for national office, and websites of Members of Congress.

We also operate the National Digital Information Infrastructure and Preservation Program (www.digitalpreservation.gov), which is pursuing a national strategy to collect, preserve and make available significant digital content, especially information that is created in digital form only, for current and future generations.

In other words, if you want a place where important historical information in digital form should be preserved for the long haul, we're it!

Needless to say, this is a pretty incredible announcement. It's great that a major public institution can step forward and take on the kind of digital preservation job that only an institution of that size and permanence is capable of.

It would be really great if their next step could be a similar archiving project for, say, Blogger or WordPress blogs. Or perhaps other big national libraries around the world could each pick a site and dedicate themselves to preserving its content for future generations.

Well, NoAstronomer, that's debatable. Twitter may be largely trivial nonsense but it also has a lot of very valuable information, comment and debate. You can see what people are talking about and time it to the second in real time, so it's great to track, for example, reactions to breaking news.

What I had for breakfast? Not so much, but I presume people will just ignore the trivia.

NoAstronomer: except maybe twittering itself?

Actually, I've already seen comments that suggest that there is a certain U.S. imperialism to the idea that Library of Congress should think they are the appropriate body to archive what is essentially a corpus of global communications. I hope that the LoC will put out an informative announcement soon; the blog entry was light on factual information and strategic aims while heavy on enthusiasm and "ain't we cool".

By Jill O'Neill (not verified) on 15 Apr 2010

I really hate this idea. I realize that when you post something on the net, it isn't private, but I don't really want the Library of Congress archiving my activity on the internet.

I think this is a great injustice and a slap in the face to people's privacy rights.

Ethan,

You give up your right to privacy when you make a public announcement via twitter. I don't see how you can call this an infringement on privacy rights.

Want privacy, don't make public announcements. You can't have it both ways.

By Andy Latham (not verified) on 15 Apr 2010

Jill O'Neill: "there is a certain U.S. imperialism to the idea that Library of Congress should think they are the appropriate body to archive what is essentially a corpus of global communications." I would be one of the first to accuse the US of inappropriately "taking over," but I have to say that I'm glad some institution is doing it. There is no global organization that would or could do it, so it would have to be some specific country. And what other country would have the resources or the motivation to do it?

Ethan Sigel: "I don't really want the Library of Congress archiving my activity on the internet" And that's not what they're doing. They're recording the results of some small portion of your activities. But that's what archiving cultural artifacts is all about. You are part of human society and affect it in some small way and therefore your life is being recorded in an indirect way. Yes, this is on a higher resolution and gathers in more individual contributions but that's because we're all contributing more individually. If you don't want to be "recorded" in this way, don't contribute.