Google's huge book scanning plan

Jeffrey Toobin, writing in the New Yorker, has an excellent article on Google's plan to scan all the books they can get their hands on into digital form:

The legal assertion at the core of Google's business plan is its purported right to scan millions of copyrighted books without payment to or permission from the copyright owners. Approximately twenty per cent of all books are in the public domain; these include books that were never copyrighted, like government publications, and works whose copyrights have expired, like "Moby-Dick." Google has simply copied such books and made them available on the Web. Roughly ten per cent of books are copyrighted and in print--that is, actively being sold by publishers. Many of these books are covered by Google's arrangement with its publisher partners, which allows the company to scan and display parts of the works.

The vast majority of books belong to a third category: still protected by copyright, or of uncertain status, and out of print. These books are at the center of the conflict between Google and the publishers. Google is scanning these books in full but making only "snippets" (the company's term) available on the Web. (Google searches turn up only the search term and about twenty words on either side of it.) Copyright law has never forbidden all "copying" of a protected work; scholars and journalists have long been allowed to quote portions of copyrighted material under the doctrine of fair use. Google maintains that the chunks of copyrighted material that it makes available on its books site are legal under fair use. "We really analogized book search to Web search, and we rely on fair use every day on Web search," David C. Drummond, a senior vice-president at Google who is overseeing the response to the lawsuits, told me. "Web sites that we crawl are copyrighted. People expect their Web sites to be found, and Google searches find them. So, by scanning books, we give books the chance to be found, too." (Google also has an "opt out" policy, which allows copyright holders to request that specific titles be omitted from the company's database.)

However, according to the plaintiffs in the cases against Google, the act of copying the complete text amounts to an infringement, even if only portions are made available to users. "What they are doing, of course, is scanning literally millions of copyrighted books without permission," Paul Aiken, the executive director of the Authors Guild, said. "Google is doing something that is likely to be very profitable for them, and they should pay for it. It's not enough to say that it will help the sales of some books. If you make a movie of a book, that may spur sales, but that doesn't mean you don't license the books. Google should pay. We should be finding ways to increase the value of the stuff on the Internet, but Google is saying the value of the right to put books up there is zero."

Google asserts that its use of the copyrighted books is "transformative," that its database turns a book into essentially a new product. "A key part of the line between what's fair use and what's not is transformation," Drummond said. "Yes, we're making a copy when we digitize. But surely the ability to find something because a term appears in a book is not the same thing as reading the book. That's why Google Books is a different product from the book itself." In other words, Google says that being able to search books on its site--which it describes as the equivalent of a giant library card catalogue--is not the same as making the books themselves available. But the publishers cite another factor in fair-use analysis: the amount of the copyrighted work that is used in the creation of the new one. Google is copying entire books, which doesn't sound "fair" to the plaintiff publishers and authors. "Traditional copyright analysis says that a transformation leads to the creation of a new and independent work, like a parody or a work of criticism," Jane Ginsburg, a professor at Columbia Law School, said. "Copying the entire work, which is what Google is doing, does not preclude a finding of fair use, but it does fall outside the traditional paradigm."

Read the whole thing.

This is of huge import to academics because Google is getting access to these books from universities. Also, think how much easier it would be to write your thesis if you had instant access to all the materials involved, cross-linked by topic. Research that now takes months could take seconds.


