Data, Copyrights, And Slogans, Part II

In the first post, I talked about how factual data aren't creative works, and how compiling them into collections doesn't make them creative - at least in the US.

This aspect of data rips away the core "incentive" provided by copyright law to creators: the right to sue people who make copies. It also has a second aspect, which is that the international treaties that govern copyright don't apply. Whatever one may think of those treaties, they do a fair amount to normalize the laws worldwide - a copyright on a Britney Spears tune applies in much the same way in wildly different countries. For this post, I'm going to focus on the EU's attempt to create an incentive for database creators and its impact.

The first strand here is the idea of incentives. Most of the time the conversation assumes that a creator needs an incentive that excludes others - that a monopoly is the best way to give creators the reason to create. It's pretty rare that the incentive conversation focuses on the rights of the users. But that's a different argument, so let's talk about the incentives to create.

In the US, the courts have been pretty clear that simply collecting things doesn't make them worthy of protection. That's the Feist decision I wrote about over the weekend. This decision hasn't hurt the US database industry, so it doesn't appear that the monopoly protection is essential to create a thriving economy around data and databases.

But in the EU, the database industry successfully convinced the European Parliament to create a completely new law for data: the Database Directive. The Directive does some harmonizing of copyright laws, applying the same sort of selection-and-arrangement criteria, which relate to the database as a whole and not to the entries themselves. There is a nice online course from UNC that summarizes the concept: "copyright in a compilation or derivative work extends only to the material contributed by the author of such work, as distinguished from the preexisting material employed in the work, and does not imply any exclusive right in the preexisting material."

This isn't the meat and potatoes of what makes the Directive interesting, though.

The Directive also creates a sui generis right around databases. Speaking as a non-lawyer, sui generis rights are weird. The name itself is a clue - these are rights that are totally distinctive, unlike other rights, related to their subject matter at a deep and unique level. It's worth noting other kinds of sui generis rights: those covering ship hulls and semiconductor masks are good examples. These things are unique to themselves. Sui generis is also a phrase we can use to describe the duck-billed platypus. For better or worse, it's one of a kind.

Like the platypus, the Directive's sui generis right appears to have been cobbled together from spare parts. It's got some ideas from copyright - like prohibiting copies. The copying it regulates is defined as "substantial extraction" of data. "Substantial" is of course itself *not* defined. Rights last 15 years (a shorter term than copyright's, which is better, for sure).

But the Directive also has the idea that these rights ought to be re-booted any time there is a "substantial change" or "substantial investment" - again, without defining "substantial." This means that the right is essentially perpetual for databases that keep growing.
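To see why the re-boot makes the right effectively perpetual, here is a minimal sketch of the arithmetic. It is a toy model, not a statement of the law: it assumes the 15-year term simply restarts in full from the year of each "substantial change," and the function name `protection_expiry` and the example years are made up for illustration.

```python
TERM_YEARS = 15  # term length as described in this post


def protection_expiry(created_year, substantial_change_years=()):
    """Return the year protection would lapse under this simplified model.

    Assumption (not legal advice): every "substantial change" restarts
    the full term, counted from the year the change happened.
    """
    last_reset = max([created_year, *substantial_change_years])
    return last_reset + TERM_YEARS


# A database that is never substantially changed after it is made.
print(protection_expiry(1998))  # 2013

# The same database, substantially updated every five years.
print(protection_expiry(1998, [2003, 2008, 2013]))  # 2028
```

As long as the updates keep coming before the clock runs out, the expiry year in this model never arrives.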

The Directive was written to provide incentives to creators - the right to sue will surely generate more of an industry, the thinking goes. However, the first evaluation by the EU didn't find evidence that this supposition was true. Their own words are the best ones:

"The economic impact of the "sui generis" right on database production is unproven. Introduced to stimulate the production of databases in Europe, the new instrument has had no proven impact on the production of databases."

Oh, wow, that's the good stuff. Empirical information and analysis on intellectual property rights is sadly about as common as egg-laying mammals. The study also notes that, during the time period since the Directive was passed, the EU database industry *actually lost ground to the US industry* despite the fact that Euro databases had the platypus and the US had...well, nothing.

The key result of all this is that despite the absence of the incentives supposedly created by "property" rights, database creation in the US not only stayed healthy but out-competed the same industry in the EU. So the incentive argument appears to be a failure here.

On top of that, the existence of the sui generis right screws up the international regimes around data. When we do database integration at Science Commons, we can't use data from the EU because of the impacts of the regime. Weird data licensing regimes screw up data integration (even when they're "open" regimes - this is why the HapMap "clickwrap" license was removed, because it was preventing integration with other genomic databases! See the US government's exact words on the issue, down in Appendix B).

Unfortunately, the facts - that the regime is failing to create incentives and actively screwing up cool things that could happen with data - aren't enough to get rid of the right.

I'll come back to this topic again in a day or two to post more about the attempts to create commons regimes around data, using contracts (bad), sui generis "viral" licensing (worse), and norms (w00t).

Your discussions caught my eye because I've been wondering what the legal status of dissertations is. That is, the work in my dissertation was "published" by my university, or at least bound into a handful of hard copies and copied onto microfilm. I then "published" this same work in peer-reviewed journals, to whom I signed over copyrights. If I were to post results from my dissertation on a blog, would I be violating a copyright? I see blogs reproducing figures left and right. Does citing a figure you've posted without permission from the journal that published it make everything alright? Or is the test commercialism? What about data from my dissertation that never made it into a peer-reviewed publication? Who "owns" that?