Orphan Books: Is Google Robbing the Warehouse?

By bioephemera on April 5, 2009.

It's just not Google's week. A mob of angry villagers north of London formed human chains and chased off the Google Maps car (no word whether they had torches). Microsoft is all up in Google's business (to be precise, they're funding a team at New York Law School's Institute for Information Law and Policy, led by a former Microsoft programmer, which is weighing in on the pending settlement of Google's book-scanning lawsuit). And it's not just Microsoft that's taking aim at Google: the NYT has an overview of the many parties, from librarians to law professors, who have serious doubts about that Google settlement. Talk about a village mob.

In the NYT, Miguel Helft discusses one of the key issues in the Google settlement: so-called orphan books. While the settlement gives authors and publishers a say in how Google uses their books, orphan out-of-print books have no clear rightsholder. Without a rightsholder to opt out of Google's database or grant rights to another organization besides Google, these orphan books basically devolve to Google's custody:

While the registry's agreement with Google is not exclusive, the registry will be allowed to license to others only the books whose authors and publishers have explicitly authorized it. Since no such authorization is possible for orphan works, only Google would have access to them, so only Google could assemble a truly comprehensive book database. (source)

If Google really did have a monopoly on digital access to those out-of-print books, it would be a huge coup. Google could charge other libraries for access to the orphan books in its database, profiting off of volumes which it scanned and digitized, but didn't write or publish - books which currently languish in obscurity (but freedom) in library warehouses across the country.

Robert Darnton, head of Harvard's libraries, says the settlement "takes the vast bulk of books that are in research libraries and makes them into a single database that is the property of Google. Google will be a monopoly." Darnton is not completely opposed to the settlement, but is very concerned about its ramifications for libraries and readers. Back in February, he wrote a long historical perspective on the settlement for the New York Review of Books:

The eighteenth-century Republic of Letters had been transformed into a professional Republic of Learning, and it is now open to amateurs--amateurs in the best sense of the word, lovers of learning among the general citizenry. Openness is operating everywhere, thanks to "open access" repositories of digitized articles available free of charge, the Open Content Alliance, the Open Knowledge Commons, OpenCourseWare, the Internet Archive, and openly amateur enterprises like Wikipedia. The democratization of knowledge now seems to be at our fingertips. We can make the Enlightenment ideal come to life in reality.

At this point, you may suspect that I have swung from one American genre, the jeremiad, to another, utopian enthusiasm. It might be possible, I suppose, for the two to work together as a dialectic, were it not for the danger of commercialization. When businesses like Google look at libraries, they do not merely see temples of learning. They see potential assets or what they call "content," ready to be mined. Built up over centuries at an enormous expenditure of money and labor, library collections can be digitized en masse at relatively little cost--millions of dollars, certainly, but little compared to the investment that went into them. (source)

Darnton makes an analogy with the skyrocketing costs of science journals, which now cost thousands of dollars per library subscription. He warns that once a subscription to Google's book search database becomes a standard feature of all libraries, and people come to see it as a necessity, Google can, and will, charge whatever it wants, because it will have no competitors. (Once you've spent twenty minutes or so digesting Darnton's essay, you can see some replies to him here.)

The truth is, I find this whole settlement debate very troubling. Google took the initiative to scan millions of library books, digitize them, and make them searchable on the web. That is a public benefit that did not exist before, and one I am glad to have. On the other hand, just because Google did this first should not preclude another company or organization from also scanning and digitizing books, including orphan books, and making them available under different terms - perhaps for free, in a grand open-access initiative. I'm not clear on how the 134-page settlement affects such future competitors to Google. Darnton says:

The class action character of the settlement makes Google invulnerable to competition. Most book authors and publishers who own US copyrights are automatically covered by the settlement. They can opt out of it; but whatever they do, no new digitizing enterprise can get off the ground without winning their assent one by one, a practical impossibility, or without becoming mired down in another class action suit. If approved by the court--a process that could take as much as two years--the settlement will give Google control over the digitizing of virtually all books covered by copyright in the United States.

In Helft's piece, Google's lawyer sort of agrees - although he spins it differently:

nothing prevented a potential rival from following in its footsteps -- namely, by scanning books without explicit permission, waiting to be sued and working to secure a similar settlement (source)

Perhaps. I'm withholding judgment until later this month; I'm going to a panel on the settlement hosted by the Information Technology and Innovation Foundation, and will hopefully get a few of my questions answered. Will update you with what I find out.

Darnton works at an institution that could easily create a competing book database on its own, if Google starts behaving badly or charging exorbitant fees for access. Harvard has the money and the books already. Princeton or Yale could probably do it, too. I think Google's settlement would actually make things easier for future competitors since they would already have a precedent for negotiating with the various rightsholders.

This is a very interesting issue. If this really does give Google a monopoly on creating and hosting a comprehensive book database, then it is very, very bad. I look forward to your post on the panel!

There are several areas of our economy where we recognize that monopolies are unavoidable and/or more efficient -- electric utilities, water supply, etc. However, in those same cases we have realized the potential abuses inherent in a monopoly and have set up appropriate regulatory commissions to make sure the public gets a fair deal.

I think we have reached the point where we will need to recognize the existence of monopolistic "information utilities" -- like Google's -- and put in place an appropriate regulatory framework.

Thanks for this article. I didn't know there was so much contention over this. Looking forward to the panel.

I have to say that I'm both confused and a little dismayed (being a lover of Teh Google). Is there talk of Google actually charging for access to the database? And are they seriously trying to claim that no one else can legally database the orphan books? Or is the fear that they could?

I think the fear is that in the long run, Google would start charging more and more for full-text access to the books, the ability to print them out, etc. While publishers and authors would have a say in the handling of copyrighted books with clear rightsholders, what to do with orphan books would be up to Google, and apparently other people might not be able to database them under this settlement. So yeah, I think people are concerned about the situation a decade or so down the road - not the situation now.

My only concern is that the scanned books not become less accessible to the public after scanning.

I love being able to find 50-year-old books in libraries that cover certain topics better than newer books are ever likely to. I still prefer holding and reading a physical book to reading one or two pages of it on a computer monitor.
