Google's huge book scanning plan

By purepedantry on January 29, 2007.

Jeffrey Toobin, writing in the New Yorker, has an excellent article on Google's plan to digitize every book it can get its hands on:

The legal assertion at the core of Google's business plan is its purported right to scan millions of copyrighted books without payment to or permission from the copyright owners. Approximately twenty per cent of all books are in the public domain; these include books that were never copyrighted, like government publications, and works whose copyrights have expired, like "Moby-Dick." Google has simply copied such books and made them available on the Web. Roughly ten per cent of books are copyrighted and in print--that is, actively being sold by publishers. Many of these books are covered by Google's arrangement with its publisher partners, which allows the company to scan and display parts of the works.

The vast majority of books belong to a third category: still protected by copyright, or of uncertain status, and out of print. These books are at the center of the conflict between Google and the publishers. Google is scanning these books in full but making only "snippets" (the company's term) available on the Web. (Google searches turn up only the search term and about twenty words on either side of it.) Copyright law has never forbidden all "copying" of a protected work; scholars and journalists have long been allowed to quote portions of copyrighted material under the doctrine of fair use. Google maintains that the chunks of copyrighted material that it makes available on its books site are legal under fair use. "We really analogized book search to Web search, and we rely on fair use every day on Web search," David C. Drummond, a senior vice-president at Google who is overseeing the response to the lawsuits, told me. "Web sites that we crawl are copyrighted. People expect their Web sites to be found, and Google searches find them. So, by scanning books, we give books the chance to be found, too." (Google also has an "opt out" policy, which allows copyright holders to request that specific titles be omitted from the company's database.)
However, according to the plaintiffs in the cases against Google, the act of copying the complete text amounts to an infringement, even if only portions are made available to users. "What they are doing, of course, is scanning literally millions of copyrighted books without permission," Paul Aiken, the executive director of the Authors Guild, said. "Google is doing something that is likely to be very profitable for them, and they should pay for it. It's not enough to say that it will help the sales of some books. If you make a movie of a book, that may spur sales, but that doesn't mean you don't license the books. Google should pay. We should be finding ways to increase the value of the stuff on the Internet, but Google is saying the value of the right to put books up there is zero."

Google asserts that its use of the copyrighted books is "transformative," that its database turns a book into essentially a new product. "A key part of the line between what's fair use and what's not is transformation," Drummond said. "Yes, we're making a copy when we digitize. But surely the ability to find something because a term appears in a book is not the same thing as reading the book. That's why Google Books is a different product from the book itself." In other words, Google says that being able to search books on its site--which it describes as the equivalent of a giant library card catalogue--is not the same as making the books themselves available. But the publishers cite another factor in fair-use analysis: the amount of the copyrighted work that is used in the creation of the new one. Google is copying entire books, which doesn't sound "fair" to the plaintiff publishers and authors. "Traditional copyright analysis says that a transformation leads to the creation of a new and independent work, like a parody or a work of criticism," Jane Ginsburg, a professor at Columbia Law School, said. "Copying the entire work, which is what Google is doing, does not preclude a finding of fair use, but it does fall outside the traditional paradigm."

Read the whole thing.

This is of huge import to academics because Google is getting access to these books from universities. Also, think how much easier it would be to write your thesis if you had instant access to all the relevant materials, cross-linked by topic. Research that takes months could take seconds.

