Are statistical data and scientific results subject to copyright?

by Jakub Marian


I recently created a map showing the number of books published per year per capita in European countries. Fewer than half of the figures in the map are based on data included in a report by the International Publishers Association. The person who collected the data for the report has contacted me, claiming that the map I made is subject to the Association’s copyright. I consider this claim rather ridiculous, but is it valid?

First, it should be noted that in most parts of the world (including the U.S.), ideas, procedures, and statistical data that can be independently retrieved or replicated (i.e. which are not specific to the subjective viewpoint of their creators) are not subject to copyright. They may be subject to patent laws in certain cases, but this doesn’t really apply to freely available statistical reports.

The EU Database Directive

The situation is slightly more complicated in the European Union, where the so-called EU Database Directive applies, which specifically provides copyright protection to databases. Which databases qualify for this protection? The directive says:

1. In accordance with this Directive, databases which, by reason of the selection or arrangement of their contents, constitute the author’s own intellectual creation shall be protected as such by copyright. No other criteria shall be applied to determine their eligibility for that protection.

This is further explained by:

2. The copyright protection of databases provided for by this Directive shall not extend to their contents and shall be without prejudice to any rights subsisting in those contents themselves.

In other words, the pieces of data themselves are NOT subject to copyright (unless, of course, the data already consist of copyrighted material, such as copyrighted images; nevertheless, no one can copyright a number); it is only the database itself (i.e. the way in which the data are selected and arranged) that is subject to the directive.

A set of numbers is not copyrightable, unless it is arranged or selected in a certain creative way. If I had copied the table itself as it was presented in the report, that would have been questionable, but claiming that putting a bunch of publicly available figures over a blank map violates someone’s copyright is ridiculous.

In fact, they should be happy I used their compilation as a source (which I cited properly in the article) and provided publicity they would otherwise not get. I could just as well have Googled the primary sources and simply ignored the report by the IPA, which I actually did in several cases to check whether the data were correct.

Fair Use

Furthermore, the European directive defines several fair use principles, just like ordinary copyright laws do. The directive specifically says that

Member States shall have the option of providing for limitations on the rights set out in Article 5 in the following cases: [...] where there is use for the sole purpose of illustration for teaching or scientific research, as long as the source is indicated and to the extent justified by the non-commercial purpose to be achieved.

Since this is an educational website and I made the map to educate my readers, my use would still count as fair use even if the data themselves were copyrighted (which we have already established is not the case).

The weird thing about the fair-use part is that it is not mandatory for member countries to adopt it. I’ve read that most EU countries have adopted it, but I cannot find any details as to where it doesn’t hold at the moment. I am wondering what practical consequences this inconsistency may have.
