Machine Learning & EU Data Sharing Practices

By Mauritz Kop[1]



Data sharing, or rather the ability to analyse and process high-quality training datasets (corpora) to teach an Artificial Intelligence (AI) model to learn, is a prerequisite for a successful Transatlantic AI ecosystem. But what about intellectual property (IP) and data protection?

In our turbulent technological era, tangible information carriers such as paper and storage media are declining in importance. Information is no longer tied to a continent, state or place. Information technology such as AI is developing at such a rapid, exponential pace that the legal problems that arise from it are to a large extent unpredictable.


  1. Legal dimensions of data

Data, or information, has a large number of legal dimensions.[2] Data sharing is associated with IP law (right to prohibit and reimburse), fundamental rights (privacy, data protection, freedom of expression and other constitutional rights)[3], fiscal law (taxation), contract law and international commercial law (e-commerce, trade treaties, anti-trust law, consumer protection).[4] In addition, the handling of personal data has ethical, social and techno-philosophical facets.

Legal ownership of data does not exist

In most European countries, the law of property is a closed system.[5] This means that the number of proprietary rights in rem, which are rights enforceable against everyone, is limited by law. Legal ownership of data therefore does not yet exist. From a property law point of view, data cannot be classified as ‘res’, as an intangible good or as a thing in which property rights can be vested. Data does have proprietary rights aspects and represents value.

Data that represent IP subject matter

Data that represent IP subject matter are protected by IP rights.[6] Data that embody original literary or artistic works are protected by copyright. New, non-obvious and useful inventions represented by data are protected by patents. Data that epitomize independently created new and original industrial designs are safeguarded by design rights.[7] Confidential data that have business or technological value are protected by trade secret rights.[8]

Sui generis database rights

Hand-labelled, annotated machine learning training datasets are awarded either a database right or a sui generis database right in Europe.[9] Although the 1996 Database Directive was not developed with the data-driven economy in mind, there has been a general tendency of extensive interpretation in favor of database protection.[10] A database right can be qualified as either a neighboring (ancillary or related) right (however shorter in duration, i.e. 15 years), or a true sui generis IP right, but not as a full copyright. A sui generis database right is an IP right with characteristics of a property right, and is awarded after a substantial investment in creating and structuring the database, be it money or time, has been made. Businesses usually consider hand-labelled, tagged training corpora to be an asset that they can license or sell to another company. This applies to the AI system’s output data as well. Like all IP rights, (sui generis) database rights are subject to exhaustion.[11] In the USA, no sui generis database right exists on augmented input or output data.[12] What Europe and the USA do have in common is that any existing IP rights on input data need to be cleared before processing.

Feeding training data to the machine qualifies as a reproduction of works, and requires a license.[13] The training corpus usually consists of copyrighted images, videos, audio, or text. If the training corpus contains non-public domain (copyrighted) works or information protected by database rights, and no text and data mining (TDM)[14] exception applies, ex ante permission to use and process must be obtained from the rightsholders (whether for scientific, commercial or non-commercial training purposes).

Clearance of machine learning training datasets

Unlicensed (or uncleared) use of machine learning input data potentially results in an avalanche of copyright (reproduction right) and database right (extraction right) infringements.[15] Some content owners will have an incentive to prohibit or monetize data mining.[16] Three solutions that address the input (training) data copyright clearance problem and create breathing room for AI developers, are the implementation of a broadly scoped, mandatory TDM exception (or even a right to machine legibility)[17] covering all types of data (including news media) in Europe,[18] the Fair Learning principle in the USA[19] and the establishment of an online clearinghouse for machine learning training datasets. Each solution promotes the urgently needed freedom to operate and removes roadblocks for accelerated AI-infused innovation.

Three solutions

The TDM exceptions were not originally created with machine learning training datasets in mind. Prominent scholars advocating the introduction of robust TDM provisions to make Europe fit for the digital age and more competitive vis-a-vis the United States and China are Bernt Hugenholtz and Christophe Geiger. The ‘Joint Comment to WIPO on Copyright and Artificial Intelligence’ addresses –inter alia– challenges related to machine learning and the much needed freedom to use training corpora. This ‘amicus brief’ discusses solutions such as individual and collective TDM licenses/exceptions, whether for commercial or scientific objectives.

On the other side of the Ocean, Mark Lemley and Bryan Casey introduced the concept of Fair Learning.[20] The authors contend that AI systems should generally be able to use databases for training whether or not the contents of that database are copyrighted. Permitting copying of works for non-expressive purposes will be -in most cases- a properly balanced, elegant policy-option to remove IP obstacles for training machine learning models and is in line with the idea/expression dichotomy.

A third solution could be the establishment of an online clearinghouse for machine learning training datasets: an ex ante or ex post one-stop-shop resembling a collective rights society, but based on a sui generis compulsory licensing system. Such a framework would include a right of remuneration for rights holders, but without the right to prohibit data usage for commercial and scientific machine learning purposes,[21] with a focus on a permitted, free flow of interoperable data.

Public versus private data

Another legal dimension that we can distinguish is the divide between public (in the hands of the government) machine generated (non) personal data and private (in the hands of the business community) machine generated (non) personal data. By machine generated data, we mean in particular information and data that are continuously generated by edge devices in the Internet of Things (IoT).[22] These edge devices are connected via edge (or fog) nodes (transmitters) to data centers that together with edge servers form the cloud. This architecture is known as edge computing.

Legal reform

Mandatory TDM exceptions are a sine qua non for machine learning in Europe.[23] A right of fair, remunerated text and data use to train an AI system needs to be mandatory and without opt outs. If a broadly scoped TDM exception were an optional limitation, with room for Member States to implement their own rules, the Digital Single Market would become fragmented instead of harmonized. A right to machine legibility that drastically improves access to data will greatly benefit the growth of the European AI-ecosystem.[24]

Besides implementing more broadly scoped TDM exceptions, the EU Commission should reform the EU Database Directive 96/9/EC to prevent data generated by connected edge devices from qualifying for sui generis database right protection. Edge computing data must not be monopolized.[25]

  2. Technical dimensions of data in machine learning

Most AI models need centralized data. In the current, dynamic field of machine learning[26], hand-labelled training datasets are a sine qua non for supervised machine learning, which uses regression and classification techniques to solve its prediction and optimization problems. This process mimics biological cognition. In contrast, unsupervised machine learning, which utilizes association and clustering (pattern recognition) techniques, uses unlabelled datasets as input to train its algorithms to discover valuable regularities in digital information. Semi-supervised learning employs a combination of labelled and unlabelled training datasets to feed our thinking machines.
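To make the contrast between labelled and unlabelled training data concrete, the following minimal Python sketch may help. It is an illustration only: the toy numbers, the ‘cat’/‘dog’ labels and all function names are assumptions and do not come from any particular AI system. The first two functions learn from hand-labelled examples (supervised classification), while the third groups the same values without any labels (unsupervised clustering).

```python
# Illustrative sketch only: toy data, labels and function names are assumptions.

def nearest_centroid_fit(points, labels):
    """Supervised: use hand-assigned labels to compute one centroid per class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def nearest_centroid_predict(centroids, x):
    """Classify a new value by the closest class centroid."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))


def two_means(points, iterations=10):
    """Unsupervised: split unlabelled 1-D points into two clusters (k-means, k=2).

    Assumes the input contains at least two distinct values.
    """
    c0, c1 = min(points), max(points)  # deterministic starting centroids
    for _ in range(iterations):
        g0 = [x for x in points if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in points if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return sorted([c0, c1])


points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]  # the hand-labelling step

centroids = nearest_centroid_fit(points, labels)      # needs the labels
cluster_centres = two_means(points)                   # uses the raw values only
```

The legal point made above follows from the first call: the supervised variant cannot be trained without the labelled corpus, which is exactly the asset in which a substantial investment has been made.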

Data in machine learning can be discrete or continuous, numerical and categorical. AI systems that utilize deep learning techniques for predictive analysis and optimization contain deep layers of artificial neural networks, with representation learning.[27] Artificial deep neural networks (ANNs and DNNs) rudimentarily mimic the architecture of human biological brains and consist of simplified, artificial neuron layers. As of 2020, DNNs do not yet have axons, soma, dendrites, neurotransmitters, plasticity, cerebral cortices and synaptic cores. In the field of AI, data mining, statistics, engineering and neuroscience converge.

Deep reinforcement learning

Reinforcement learning does not require existing input datasets. Instead, the model learns from data from simulations and games using a reward system based on continuous feedback. Deep reinforcement learning systems, such as AlphaGo, are not easy to train. Too many correlations in the data interfere with its goal-oriented algorithms’ stable learning process. Inference applies the capabilities of a pre-trained deep learning system to new datasets, to predict its output in the form of new, useful real-world values and information.
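The reward-based feedback loop described above can be sketched with tabular Q-learning, a classic reinforcement learning method. This is a simplified illustration under stated assumptions, not a deep reinforcement learning system such as AlphaGo: the five-cell ‘corridor’ environment, the parameter values and the function name are all hypothetical. The agent receives no training dataset; it only experiences rewards while acting in a simulation.

```python
import random


def train_corridor(n_states=5, episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a hypothetical corridor: start at cell 0, goal at the last cell.

    The agent explores with random moves (Q-learning is off-policy, so it can
    still learn the best move per cell) and is rewarded only on reaching the goal.
    """
    rng = random.Random(seed)
    actions = (-1, +1)  # step left or step right
    goal = n_states - 1
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}

    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # safety cap on episode length
            action = rng.choice(actions)
            nxt = min(max(state + action, 0), goal)  # walls at both ends
            reward = 1.0 if nxt == goal else 0.0
            best_next = 0.0 if nxt == goal else max(q[(nxt, a)] for a in actions)
            # Move the estimate towards reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if state == goal:
                break

    # Learned policy: the best-valued action in every non-goal cell.
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in range(goal)}


policy = train_corridor()
```

In this toy setting the learned policy is to step right in every cell, which the agent discovers purely from the reward signal.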

Transfer learning is a machine learning method that seeks to apply a certain solution model for a particular problem to another, different problem. Applying a pre-trained model to new (and smaller) datasets can turn a one trick pony into the ultimate synthetic multitasker.

Evolutionary computing uses genetic optimization algorithms inspired by neo-Darwinian evolution theory.[28] Genetic algorithms can be used standalone[29], or to train ANNs and DNNs and to identify suitable training corpora.
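As a rough illustration of how a genetic algorithm works, the sketch below evolves bit strings towards the string with the most ones (the textbook ‘OneMax’ toy problem). The problem, the parameter values and the function name are assumptions chosen for brevity; real applications would substitute their own fitness function.

```python
import random


def evolve(bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Minimal genetic algorithm: selection, crossover, mutation and elitism."""
    rng = random.Random(seed)
    fitness = sum  # OneMax: fitness is the number of 1-bits
    population = [[rng.randint(0, 1) for _ in range(bits)] for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]   # selection: keep the fitter half
        children = [population[0][:]]           # elitism: best individual survives
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, bits)        # one-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children

    return max(population, key=fitness), history


best, history = evolve()
```

Because the best individual is always carried over unchanged, the best fitness recorded in `history` can never decrease from one generation to the next.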

The approaches described above are all centralized machine learning techniques. Federated learning, in contrast, trains algorithms that are distributed over multiple decentralized edge devices in the Internet of Things. These mobile devices, such as your smartphone, hold local data samples and do not exchange them. The interconnected IoT devices collaboratively train a model under a central server.[30] Federated Learning is a scalable, distributed machine learning approach which enables model training on a large corpus of decentralized data.[31] “Federated learning embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches.”[32] It brings the code to the data, instead of bringing the data to the code.[33] In other words, there is no need for sharing data.
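The idea of bringing the code to the data can be sketched with a simplified form of federated averaging. The sketch below is an assumption-laden illustration, not the implementation of any cited system: the one-parameter model y ≈ w · x, the three hypothetical devices, the learning rate and the function names are all invented for the example. Each device trains on its own samples and returns only an updated model weight; the server averages those weights and never sees the underlying data.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Runs on the device: gradient descent on local (x, y) pairs for y ≈ w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w  # only the weight leaves the device, not the data


def federated_average(clients, rounds=50, w=0.0):
    """Runs on the server: average the returned weights, weighted by sample count."""
    total = sum(len(data) for data in clients)
    for _ in range(rounds):
        updates = [(local_update(w, data), len(data)) for data in clients]
        w = sum(w_i * n for w_i, n in updates) / total
    return w


# Hypothetical local datasets held by three devices; all follow y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]

global_weight = federated_average(clients)  # converges towards 2.0
```

In this sketch the raw (x, y) pairs are only ever read inside `local_update`, which mirrors the data minimization argument made in the quotation above. Model updates can still reveal information about the underlying data in practice, so the sketch should not be read as a complete privacy guarantee.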


  3. Data: contracts, property law and trade secrets

IP on training data and data management systems is subject to both property law aspects and proprietary rights in rem that are enforceable against everyone. Data is not a purely immaterial, non-physical object in the legal (not the natural-scientific) meaning of the word. However, if a party to a dataset transaction has acquired a contractual claim right in exchange for material benefits provided by him, there is a proprietary right. This proprietary right in rem is subject to transfer, license and delivery.

The attitude of the parties, and their legal consequence-oriented behaviour when concluding contracts about datasets and their proprietary aspects may perhaps prevail over the absence of a clear legal qualification of data[34] (or information) in the law. In this case, party intentions go beyond the legal void.[35] In other words, legislative gaps can be remedied by contracts.[36]

Legal ownership, or property, is different from an IP right. IP is a proprietary right in rem. An IP right can entail a right to use data, in the form of a license.

Extra layers of rights will not bring more innovation

Raw non personal machine generated data are not protected by IP rights.[37] Introducing an absolute data property right or a (neighboring) data producer right for augmented machine learning training datasets, or other classes of data, is not opportune. Economic literature has made clear that there are no convincing economic, or innovation policy arguments for the introduction of a new layer of rights, especially due to the absence of an incentive and reward problem for the production and analysis of datasets.[38]

Moreover, additional exclusive rights will not automatically bring more innovation. Instead, it will result in overlapping IP rights and database right thickets.[39] The introduction of a sui generis system of protection for AI-generated Creations & Inventions is -in most industrial sectors- not necessary since machines do not need incentives to create or invent.[40] Where incentives are needed, IP alternatives exist. Finally, there are sufficient IP instruments to protect the various components of the AI systems that process data, create and invent.[41] Because of theoretical cumulation of copyrights, patents, trade secrets and database rights, protection overlaps may even exist.[42]

Public Property from the Machine

Non-personal data that is autonomously generated by an AI system and where upstream and downstream no significant human contribution is made to its creation, should fall into the public domain.[43] It should be open data, excluded from protection by the Database Directive, the Copyright Directive[44] and the Trade Secrets Directive.

These open, public domain datasets can then be shared freely without having to pay compensation and without the need for a license. No monopoly can be established on this specific type of database. I would like to call these AI Creations “Res Publicae ex Machina”[45] (Public Property from the Machine). Their classification can be clarified by means of an official public domain status stamp or marking (PD Mark status).[46] Freedom of expression and information are core democratic values that, together with proportionality, should be internalized in our IP framework. Reconceptualizing and strengthening the public domain paradigm within the context of AI, data and IP is an important area for future research.[47]

Data as trade secret

In practice, however, to safeguard investments and monetize AI applications, companies will try hard either to keep the data a trade secret or to protect the overall database, whether it was hand-coded or machine generated. From an AI perspective, the various strategies to maximize the quality and value of a company’s IP portfolio can differ for database rights, patents and trade secrets on the input and output of an AI system. Moreover, this strategy can differ per sector and industry (e.g. software, energy, art, finance, defence).

As legal uncertainty about the patentability of AI systems[48] is causing a shift towards trade secrets, legal uncertainty about the protection and exclusive use of machine generated databases is causing a similar shift towards trade secrets. Although the Trade Secrets Directive was not written with the data-driven economy in mind, the large scope of its definition of a trade secret means that derived and inferred data can in theory be classified under it.[49] This general shift towards trade secrets to keep competitive advantages results in a disincentive to disclose information and impedes data sharing.[50]

In an era of exponential innovation, it is urgent and opportune that the EU legislature reform the Trade Secrets Directive, the Copyright Directive and the Database Directive with the data-driven economy in mind.


  4. EU open data sharing initiatives

Data can be shared between Government, Businesses, Institutions and Consumers, either within an industry sector or cross-sectorally.

Important European initiatives in the field of open data[51] and data sharing are: the Support Centre for Data Sharing (focused on data sharing practices), the European Data Portal (EDP, data pooling per industry, i.e. sharing open datasets from the public sector), the Open Data Europe Portal (ODP, sharing data from European institutions), the Free flow of non-personal data initiative (including the FFD-Regulation, cyber security and self-regulation) and the EU Blockchain Observatory and Forum.

A European initiative in the strongly related field of AI is the European AI Alliance, established by the EU Commission. An international project on AI and -inter alia- training data is the “AI and Data Commons” of the ITU (International Telecommunication Union).

EU Data Strategy

On February 19, 2020, the EU Commission published its ‘EU Data Strategy’.[52] The EU aims to become a leading role model for a society empowered by data and will to that end create Common European Data Spaces in verticals such as Industrial Manufacturing, Health, Energy, Mobility, Finance, Agriculture and Science. An industrial package to further stimulate data sharing follows in March 2020.

In addition, the EU Commission has appointed an Expert Group to advise on Business-to-Government Data Sharing (B2G).[53] In its final report, the Expert Group recommends the creation of a recognized data steward function in both public and private sectors, the organization of B2G data-sharing collaborations and the implementation of national governance structures by Member States.[54] The aim of B2G data sharing is to improve public service, deploy evidence-based policy and advise the EU Commission on the development of B2G data sharing policy.

In its 2019 Policy & Investment Recommendations, the High-Level Expert Group on Artificial Intelligence (AI-HLEG) also devoted an entire section to fostering a European data economy, including data sharing recommendations, data infrastructure and data trusts.[55] Finally, in a recent report, the Opinion of the German Data Ethics Commission, 75 authoritative recommendations were made on general ethical and legal principles concerning the use of data and data technology.

Given that data are generated by such a vast and varied array of devices and activities, and used across so many different economic sectors and industries, it is not easy to picture an all-inclusive single policy framework for data.[56]


Dutch vision on B2B data sharing

At the beginning of this year, the Dutch government published a booklet about the Dutch Digitization Strategy, in which it sets out its vision on data sharing between companies. This vision consists of 3 principles:

  • Principle 1: Data sharing is preferably voluntary.
  • Principle 2: Data sharing is mandatory if necessary.
  • Principle 3: People and companies keep a grip on data.

The Dutch Ministry of Economic Affairs is currently exploring the possibilities of encouraging the use of internationally accepted FAIR principles in sharing private data for AI applications. FAIR stands for Findable, Accessible, Interoperable, Reusable. The Personal Health Train initiative builds on FAIR data principles.[57]

Recent Dutch initiatives in the field of data sharing are the Dutch Data Coalition (self-sovereignty of data), aimed at cross-sectoral data sharing between companies and institutions, the Dutch AI Coalition (NL AIC) as well as some hands-on Data Platform and Data Portal projects from leading academic hospitals, Universities of Technology and frontrunning companies.


  5. Mixed datasets: 2 laws (GDPR & FFD Regulation) in tandem

More and more datasets consist of both personal and non-personal machine generated data; both the General Data Protection Regulation (GDPR)[58] and the Regulation on the free flow of non-personal data (FFD)[59] apply to these “mixed datasets”. The Commission has drawn up guidelines for these mixed datasets where both the FFD Regulation and the GDPR apply, including its right to data portability.[60] Based on these two Regulations, data can move freely within the European Union.[61]


Market barriers for early-stage AI-startups

The GDPR thoroughly protects the personal data of EU citizens. In some cases, however, GDPR legislation also hampers the European internal market with regard to the rapid rollout of AI and data startups (SMEs). This applies in particular to a smaller group of early-stage AI-startups, which often lack sufficient resources to hire a specialized lawyer or a Data Protection Officer. These companies are therefore hesitant to do anything spectacular with personal data,[62] or do so only within large public-private consortia in which one operates ‘gründlich’ (thoroughly), but where it takes (too) long to create the necessary trust among the participants. This hinders the innovative performance of early-stage AI-startups. In that sense, complex data protection rules do not encourage ambitious moonshot thinking, creative, revolutionary AI and data field experiments and the design of clever products that solve real-world problems. It is paramount that the whole field has a good grasp of the legal dimensions of its data, and that there are no significant restrictions and market barriers in that important early stage.[63] Sharing data is simply a necessary condition for a successful AI ecosystem.[64]

Precautionary principle

A second axiom that has the potential to inhibit rapid scientific advances in the EU, in case of expected large risks or unknown risks, is the precautionary principle. EU lawmakers have a tendency to minimize risk and prevent all possible negative scenarios ex ante via legislation. This does not make the drafting of directives and regulations any faster. Rigid application of the precautionary principle in EU law promotes excessive caution and hinders progress. It remains at odds with accelerated technological innovation.[65]


  6. California Consumer Privacy Act (CCPA 2020)

The GDPR also has some important advantages for European startups and scaleups. The advantage of the GDPR is that it is now the international standard in the field of the use of personal data when doing business internationally.[66] Partly for this reason, California has largely taken over the spirit/contents[67] of the GDPR, and implemented it -with a fundamental American approach- in its own regulations that better protect consumer data and safeguard the trade thereof.[68] The California Consumer Privacy Act (CCPA 2020), state-level privacy legislation, came into force on January 1, 2020.[69] If European startups and scaleups are completely GDPR-proof, there will be no privacy legislation anywhere in the world that will require major changes to their personal data protection policy, including the associated legal uncertainty and legal costs. This is a significant competitive advantage. From that lens, European tech startups and AI-scaleups have a head start on their competitors from outside the European Union.[70]


  7. Future EU AI and Data Regulation: CAHAI & EU Commission Whitepaper

Transformative technology is not a zero sum game, but a win-win strategy that creates new value. The Fourth Industrial Revolution will create a world where anything imaginable to improve the human condition, could actually be built.[71]

The CAHAI (Ad Hoc Committee on Artificial Intelligence), established by the Committee of Ministers of the Council of Europe,[72] is currently examining the possibility of a binding legal framework for the development, design and application of AI and data, based on the universal principles and standards of the Council of Europe on human rights, democracy and the rule of law. The CAHAI expects to be able to report by March 2020 on the possibilities and necessity of new legislation.

Both data sharing practices and AI-Regulation are high on the EU Commission’s agenda. On February 19th 2020, the EU Commission published its ‘White Paper On Artificial Intelligence – A European approach to excellence and trust’.[73] Fortunately, the White Paper uses a risk-based approach, not a precautionary principle-based approach. The Commission ‘supports a regulatory and investment oriented approach with the twin objective of promoting the uptake of AI and of addressing the risks associated with certain uses of this new (data-driven) technology.’ [74] In its White Paper, the Commission addresses issues concerning the scope of a future EU regulatory framework and -to ensure inclusiveness and legal certainty- discusses requirements for the use of training datasets.[75] In addition, the Commission contends that independent audits, certification and prior conformity assessments[76] for high risk areas like Health and Transportation, could be entrusted to notified bodies (instead of commercial parties) designated by Member States. The Commission concludes with the desire to become a global hub for data and to restore technological sovereignty.

Pareto optimum

When developing informed transformative tech related policies, the starting point is to identify the desired outcome.[77] In the case of IP policy, that outcome would be to compose a regime that balances underprotection and overprotection of IP rights per economic sector. IP is supposed to serve as a regulatory system that stimulates creation and innovation and uses market dynamics to reach this objective.[78] The goal should be no less than a Pareto optimum and if possible a Pareto improvement by incentivizing innovation, encouraging scientific progress and increasing overall prosperity.[79]

Modalities of AI-regulation

Law is just one modality of AI-regulation.[80] Other important regulatory modalities to balance the societal effects of exponential innovation and digital transformation are the actual design of the AI system, social norms and the market.[81] Data governance should be less fixed on data ownership and more on rules for the usage of data.

The goal should be a global open data sharing community with freedom to operate and healthy competition between firms, including unification of data exchange models so that they are interoperable and standardized in the IoT.[82] There is an urgent need for comprehensive, cross-sectoral data reuse policies that include standards for interoperability[83], compatibility, certification and standardization.[84]

Against this background, strengthening and articulation of competition law is more opportune than extending IP rights.[85] Within the context of AI-regulation and data sharing practices, there is no need for adding extra layers of copyrights, database rights, patent rights and trade secret rights.[86]

Technology shapes society, society shapes technology

Society should actively shape technology for good. The alternative is that other societies, with social norms and democratic standards that perhaps differ from our own public values, impose their values on us through the design of their technology.

AI for Good norms, such as data protection by design and by default, as well as accountability of controllers and processors, transparency, trust and control, should be built into the architecture of AI systems and high-quality training datasets from the first line of code.[87] In practice, this can be accomplished through technological synergies such as a symbiosis of AI and blockchain technology. Crossovers can offer solutions for challenges concerning the AI-black box, algorithmic bias and unethical use of data.[88] That way, society can benefit from the benevolent side of AI.

Robust, collaborative AI framework development standards such as federated machine learning[89] models provide personalized AI and safeguard data privacy, data protection, data security and data access rights. Using Privacy by Design as a starting point, with built-in public values, the federated learning model is consistent with Human-Centered AI and the European Trustworthy AI paradigm.[90] As technology shapes society, society shapes technology.

[1] Mauritz Kop, Stanford Law School TTLF Fellow, Stanford University; Managing Partner at AIRecht, Amsterdam, The Netherlands. Correspondence: The author would like to thank Mark Lemley, Begoña Gonzalez Otero, Teresa Quintel, Suzan Slijpen and Nathalie Smuha for valuable remarks on an earlier draft of this article, and Christophe Geiger for his lecture on Big Data, Artificial Intelligence, Freedom of Information and the TDM exception, organized by IViR, 10 March 2020.

[2] Data and information are not always interchangeable terms. From a European trade secrets perspective, it is not clear whether data or datasets fulfill the requirements of Article 2(1) of the EU Trade Secrets Directive (TSD). When data is mentioned in the TSD, the term seems not to be understood as “datasets” but rather in the context of customer/supplier lists: “commercial data” in recital 2 or “personal data” in Article 9(4). The TSD was not developed with the data-driven economy in mind, but rather with the information society in mind (recitals 1 and 4).

[3] Privacy and data protection are not always interchangeable terms. Privacy is a human right as enshrined in Article 12 of the Universal Declaration of Human Rights.

[4] See for international commercial law aspects: Kristina Irion & Josephine Williams (2019). ‘Prospective Policy Study on Artificial Intelligence and EU Trade Policy’. Amsterdam: The Institute for information Law (IViR) 2019. See for consumer protection: Gabriele Accardo and Maria Rosaria Miserendino, ‘Big Data: Italian Authorities Published Guidelines and Policy Recommendation on Competition, Consumer Protection, and Data Privacy Issues’, TTLF Newsletter on Transatlantic Antitrust and IPR Developments Stanford-Vienna Transatlantic Technology Law Forum, Stanford University, 2019 Volume 3-4. See for unfair competition law, data sharing and social media platforms: Catalina Goanta, ‘Facebook’s Data Sharing Practices under Unfair Competition Law’, TTLF Newsletter on Transatlantic Antitrust and IPR Developments Stanford-Vienna Transatlantic Technology Law Forum, Stanford University, 2018 Volume 2. See for competition law as a driver for digital innovation and its relationship with IP law: Josef Drexl, ‘Politics, digital innovation, intellectual property and the future of competition law’, Concurrences Review 4 (2019), 2-5.

[5] Most European Member States have civil law systems. Great Britain, like the USA, has a common law system.

[6] WIPO Conversation on Intellectual Property (IP) and Artificial Intelligence (AI), Second Session, Draft Issues Paper on Intellectual Property Policy and Artificial Intelligence, prepared by the WIPO Secretariat, December 13, 2019.

[7] Ibid. See also:

[8] WIPO is planning to launch a digital time stamping service that will help innovators and creators prove that a certain digital file was in their possession or under their control at a specific date and time. See: ‘Intellectual property in a data-driven world’, WIPO Magazine, October 2019. The time stamping initiative is a digital notary service that resembles the BOIP i-Depot, see

[9] Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases (Database Directive): For an analysis of the rules on authorship and joint authorship of both databases and database makers’ sui generis rights, and how to overcome potential problems contractually see: Michal Koščík & Matěj Myška (2017), ‘Database authorship and ownership of sui generis database rights in data-driven research’, International Review of Law, Computers & Technology, 31:1, 43-67, DOI: 10.1080/13600869.2017.1275119

[10] See also CJEU, Case C-490/14 Verlag Esterbauer. The CJEU notes that the term “database” is to be given a wide interpretation. In the case of hand-labelled data for supervised machine learning, application of the Database Directive is not really straightforward. The Database Directive does not distinguish between hand and machine coding in what it protects, only between digital and analogue databases. It has been evaluated for the second time in 2018, see

[11] Mezei, Péter, Digital First Sale Doctrine Ante Portas — Exhaustion in the Online Environment (June 7, 2015). JIPITEC – Journal of Intellectual Property, Information Technology and E-Commerce Law, Vol. 6., Issue 1., p. 23-71, 2015. Available at SSRN: This rule has two exceptions: online transmission of the database and lending or rental of databases do not result in exhaustion.

[12] Bernt Hugenholtz, ‘Something Completely Different: Europe’s Sui Generis Database Right’, in: Susy Frankel & Daniel Gervais (eds.), The Internet and the Emerging Importance of New Forms of Intellectual Property (2016), 205-222. See also SCOTUS landmark decision Feist: Feist Publications, Inc., v. Rural Telephone Service Company, Inc., 499 U.S. 340 (111 S.Ct. 1282, 113 L.Ed.2d 358), No. 89-1909.

[13] See also James Grimmelmann, ‘Copyright for Literate Robots’ (101 Iowa Law Review 657 (2016), U of Maryland Legal Studies Research Paper No. 2015-16) 678,  Access to out-of-commerce works held by cultural heritage institutions also requires clearance. In Europe, this license can be obtained from collective rights organisations (Article 8 CDSM Directive).

[14] The non-technologically neutral definition of ‘text and data mining’ in the CDSM Directive is ‘any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations’.

[15] Whether for research purposes or for commercial product development purposes.

[16] Bernt Hugenholtz, The New Copyright Directive: Text and Data Mining (Articles 3 and 4), Kluwer Copyright Blog (July 24, 2019),  Article 4 CDSM allows right holders to opt out of the TDM exemption.

[17] Ducato, Rossana and Strowel, Alain M., ‘Limitations to Text and Data Mining and Consumer Empowerment: Making the Case for a Right to Machine Legibility’ (October 31, 2018). CRIDES Working Paper Series, 2018. Available at SSRN:

[18] Geiger, Christophe and Frosio, Giancarlo and Bulayenko, Oleksandr, ‘The Exception for Text and Data Mining (TDM) in the Proposed Directive on Copyright in the Digital Single Market – Legal Aspects’ (March 2, 2018). Centre for International Intellectual Property Studies (CEIPI) Research Paper No. 2018-02.

[19] Lemley, Mark A. and Casey, Bryan, Fair Learning (January 30, 2020). Available at SSRN:

[20] Ibid. (supra note 19)

[21] See also WIPO (supra note 6)

[22] Such as in smart cities, smart energy meters, Wi-Fi lamps and user gadgets including smart wearables, televisions, smart cameras, smartphones, game controllers and music players.

[23] Countries with more room in their legal frameworks, i.e. fewer legal barriers to training machine learning models, are Switzerland, Canada, Israel, Japan and China.

[24] Ducato and Strowel (supra note 17)

[25] Such an innovation friendly reform directly impacts the Digital Single Market. It is to be hoped that the necessary policy space to realize these much needed revisions exists in Brussels.

[26] For the latest scientific breakthrough in machine learning methods see: Matthew Vollrath, ‘New machine learning method from Stanford, with Toyota researchers, could supercharge battery development for electric vehicles’, February 19, 2020. According to Stanford professors Stefano Ermon and William Chueh, the machine isn’t biased by human intuition. The researchers’ ultimate goal is to optimize the process of scientific discovery itself.

[27] An example of such an AI system is a generative adversarial network, which consists of two different neural networks competing in a game.

[28] Drexl, Josef and Hilty, Reto and Beneke, Francisco and Desaunettes, Luc and Finck, Michèle and Globocnik, Jure and Gonzalez Otero, Begoña and Hoffmann, Jörg and Hollander, Leonard and Kim, Daria and Richter, Heiko and Scheuerer, Stefan and Slowinski, Peter R. and Thonemann, Jannick, Technical Aspects of Artificial Intelligence: An Understanding from an Intellectual Property Law Perspective (October 8, 2019). Max Planck Institute for Innovation & Competition Research Paper No. 19-13. Available at SSRN:

[29] For example in NASA Antenna. See: Hornby, Greg & Globus, Al & Linden, Derek & Lohn, Jason. (2006), ‘Automated Antenna Design with Evolutionary Algorithms’, Collection of Technical Papers – Space 2006 Conference. 1. 10.2514/6.2006-7242.

[30] Kairouz, Peter & McMahan, H. & Avent, Brendan & Bellet, Aurélien & Bennis, Mehdi & Bhagoji, Arjun & Bonawitz, Keith & Charles, Zachary & Cormode, Graham & Cummings, Rachel & D’Oliveira, Rafael & El Rouayheb, Salim & Evans, David & Gardner, Josh & Garrett, Zachary & Gascón, Adrià & Ghazi, Badih & Gibbons, Phillip & Gruteser, Marco & Zhao, Sen. (2019). ‘Advances and Open Problems in Federated Learning’,

[31] Bonawitz, Keith & Eichner, Hubert & Grieskamp, Wolfgang & Huba, Dzmitry & Ingerman, Alex & Ivanov, Vladimir & Kiddon, Chloe & Konečný, Jakub & Mazzocchi, Stefano & McMahan, H. & Overveldt, Timon & Petrou, David & Ramage, Daniel & Roselander, Jason. (2019), ‘Towards Federated Learning at Scale: System Design’,

[32] Ibid. (supra note 30)

[33] Ibid. (supra note 31)

[34] Tjong Tjin Tai, Eric, ‘Een goederenrechtelijke benadering van databestanden’, Nederlands Juristenblad, 93(25), 1799 – 1804. Wolters Kluwer, ISSN 0165-0483. The author contends that data files should be treated analogous to property of tangible objects within the meaning of Book 3 and 5 of the Dutch Civil Code, as this solves several issues regarding data files.

[35] Until new European legislation creates clarity, gaps and uncertainties will have to be filled by the courts.

[36] Unfortunately, licensing large datasets commercially almost never works out in practice.

[37] For further reading about IP and property rights vested in private data see Begoña Otero, ‘Evaluating the EC Private Data Sharing Principles: Setting a Mantra for Artificial Intelligence Nirvana?’, 10 (2019) JIPITEC 87 para 1. For non-personal machine generated data see P. Bernt Hugenholtz, ‘Data Property: Unwelcome Guest in the House of IP’ (25 August 2017), and Ana Ramalho, ‘Data Producer’s Right: Power, Perils & Pitfalls’ (Paper presented at Better Regulation for Copyright, Brussels, Belgium 2017)

[38] Kerber, Wolfgang, ‘A New (Intellectual) Property Right for Non-Personal Data? An Economic Analysis’ (October 24, 2016). Gewerblicher Rechtsschutz und Urheberrecht, Internationaler Teil (GRUR Int), 11/2016, 989-999. See also Landes, William M., and Richard A. Posner, ‘An Economic Analysis of Copyright Law’, The Journal of Legal Studies, vol. 18, no. 2, 1989, pp. 325–363. JSTOR,

[39] James Boyle, The Public Domain: Enclosing the Commons of the Mind, (Orange Grove Books 2008) 236

[40] Kop, Mauritz, AI & Intellectual Property: Towards an Articulated Public Domain (June 12, 2019). Forthcoming Texas Intellectual Property Law Journal 2020, Vol. 28. Available at SSRN: The legal concept of Res Publicae ex Machina is a catch-all solution.

[41] Exhaustion of certain IP rights may apply, see note 11. See also Shubha Ghosh and Irene Calboli, ‘Exhausting Intellectual Property Rights: A Comparative Law and Policy Analysis’, (CUP 2018), 101

[42] Ibid. Kop (supra note 40). See also Deltorn, Jean-Marc and Macrez, Franck, Authorship in the Age of Machine learning and Artificial Intelligence (August 1, 2018). In: Sean M. O’Connor (ed.), The Oxford Handbook of Music Law and Policy, Oxford University Press, 2019 (Forthcoming) ; Centre for International Intellectual Property Studies (CEIPI) Research Paper No. 2018-10. Available at SSRN:

[43] This means that there should be no sui generis database right vested in such datasets in Europe. No contract or license will be required for the consent of the right holders for analysis, use or processing of the data.

[44] Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market and amending Directives 96/9/EC and 2001/29/EC (CDSM Directive),

[45] Kop (supra note 40). The legal concept of Res Publicae ex Machina is a catch-all solution.

[46] Autonomously generated non personal datasets should be public domain.

[47] Hilty, Reto and Hoffmann, Jörg and Scheuerer, Stefan, Intellectual Property Justification for Artificial Intelligence (February 11, 2020). Draft chapter. Forthcoming in: J.-A. Lee, K.-C. Liu, R. M. Hilty (eds.), Artificial Intelligence & Intellectual Property, Oxford, Oxford University Press, 2020, Forthcoming; Max Planck Institute for Innovation & Competition Research Paper No. 20-02. Available at SSRN: The article debates the question of justification of IP rights for both AI as a tool and AI-generated output in light of the theoretical foundations of IP protection, from both legal embedded deontological and utilitarian economic positions.

[48] Kop (supra note 40). Not opting for the patent route poses the risk of (bona fide) independent invention by someone else who does opt for the patent route instead of the trade secret strategy.

[49] Wachter, Sandra and Mittelstadt, Brent, ‘A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI’ (October 05, 2018). Columbia Business Law Review, 2019(1).

[50] Kop (supra note 40). Besides that, uncertainty about the scope of the TDM exceptions leads to litigation.

[51] For certain AI systems, open data should be required for safety reasons.

[52] European Commission, ‘A European strategy for data’, Brussels, 19.2.2020 COM(2020) 66 final,  &

[53] Towards a European strategy on business-to-government data sharing for the public interest. Final report prepared by the High-Level Expert Group on Business-to-Government Data Sharing, Brussels, European Union, February 2020, doi:10.2759/731415 The report provides a detailed overview of B2G data sharing barriers and proposes a comprehensive framework of policy, legal and funding recommendations to enable scalable, responsible and sustainable B2G data sharing for the public interest.

[54] Ibid.

[55] High-Level Expert Group on Artificial Intelligence, ‘Policy and Investment Recommendations for Trustworthy AI’ (European Commission, 26 June 2019).

[56] Ibid. (supra note 6)

[57] Johan van Soest, Chang Sun, Ole Mussmann, Marco Puts, Bob van den Berg, Alexander Malic, Claudia van Oppen, David Towend, Andre Dekker, Michel Dumontier, ‘Using the Personal Health Train for Automated and Privacy-Preserving Analytics on Vertically Partitioned Data’, Studies in Health Technology and Informatics 2018, 247: 581-585

[58] Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). A new European ePrivacy Regulation is currently under negotiation. Data protection and privacy are two different things.

[59] Regulation (EU) 2018/1807 of the European Parliament and of the Council of 14 November 2018 on a framework for the free flow of non-personal data in the European Union (FFD Regulation).

[60] Practical guidance for businesses on how to process mixed datasets:

[61] Besides the GDPR, the Law Enforcement Directive (LED) regulates requirements aimed at ensuring that privacy and personal data are adequately protected during the use of AI-enabled products and services. LED: Directive (EU) 2016/680 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data by competent authorities for the purposes of the prevention, investigation, detection or prosecution of criminal offences or the execution of criminal penalties, and on the free movement of such data, and repealing Council Framework Decision 2008/977/JHA.

[62] I speak from personal experience in our law firm. This concerns especially European AI-startups who often do not have the necessary budget to be properly advised on how to navigate data protection and data sharing regulation. See for a first report that confirms this claim: OECD Report ‘Enhancing Access to and Sharing of Data – Reconciling Risks and Benefits for Data Re-use across Societies’, November 26, 2019, Chapter 4.

[63] A solution that takes away legal roadblocks and encourages market entry of early-stage AI-startups could be targeted government funding in the form of knowledge vouchers.

[64] From this point of view, innovation remains at odds with privacy.

[65] In certain domains, performing independent audits and conformity assessments by notified bodies might be a better option. Especially in a civil law legal tradition, where lawmakers draft concise statutes that are meant to be exhaustive.

[66] With 500 million consumers, Europe is the largest single market in the world.

[67] For a close comparison of the GDPR and California’s privacy law, see Chander, Anupam and Kaminski, Margot E. and McGeveran, William, ‘Catalyzing Privacy Law’ (August 7, 2019). U of Colorado Law Legal Studies Research Paper No. 19-25. Available at SSRN: The article contends that California has emerged as an alternate contender in the race to set the new standard for privacy (which, as mentioned in note 3, is not always the same as data protection).

[68] Mark A. Lemley, ‘The Splinternet’, Lange Lecture Duke Law School, January 22 2020,


[70] Such as China, India, Japan, South Korea and Taiwan.

[71] Autonomous AI agents that use data and deep learning techniques to continuously perform and improve at their tasks already exist, including AI agents that autonomously invent novel technologies and create original art. These AI systems need data to mature.

[72] The Council of Europe, located in Strasbourg, France, is not the same governing body as the European Commission. The Council of Europe is not part of the European Union. The European Court of Human Rights, which enforces the ECHR, is part of the Council of Europe.

[73] European Commission, White Paper on Artificial Intelligence – A European approach to excellence and trust, Brussels, 19.2.2020 COM(2020) 65 final,

[74] Ibid.

[75] Ibid.

[76] Alternative Regulatory Instruments (ARIs) such as the AI Impact Assessment, see: See also: Carl Vander Maelen, ‘From opt-in to obligation? Examining the regulation of globally operating tech companies through alternative regulatory instruments from a material and territorial viewpoint’, International Review of Law, Computers & Technology, 2020, DOI: 10.1080/13600869.2020.1733754

[77] See also WIPO (supra note 8). WIPO is comparing the main government instruments and strategies concerning AI and IP regulation and will create a dedicated website that collects these resources for the purpose of information sharing.

[78] Hilty (supra note 47)

[79] Kop (supra note 40)

[80] Smuha, Nathalie A., From a ‘Race to AI’ to a ‘Race to AI Regulation’ – Regulatory Competition for Artificial Intelligence (November 10, 2019). Available at SSRN: The author contends that AI applications will necessitate tailored policies on the one hand, and a holistic regulatory approach on the other, with due attention to the interaction of various legal domains that govern AI.

[81] Lawrence Lessig, The Law of the Horse: What Cyberlaw Might Teach, 113 Harvard Law Review 501-549 (1999)

[82] Otero (supra note 37). For user generated data see: Shkabatur, Jennifer, ‘The Global Commons of Data’ (October 9, 2018). Stanford Technology Law Review, Vol. 22, 2019; GigaNet: Global Internet Governance Academic Network, Annual Symposium 2018. Available at SSRN:

[83] For an example of interconnectivity and interoperability of databases in line with the fundamental rights standards enshrined in the EU Charter: Quintel, Teresa, Connecting Personal Data of Third Country Nationals: Interoperability of EU Databases in the Light of the CJEU’s Case Law on Data Retention (March 1, 2018). University of Luxembourg Law Working Paper No. 002-2018. Available at SSRN:

[84] John Wilbanks & Stephen H Friend, ‘First, design for data sharing’, (Nature, 2016)

[85] Drexl (supra note 2). The Fourth Industrial Revolution may even require a complete redesign of our current IP regime.

[86] Kop (supra note 40). For non-IP policy tools that incentivize innovation, see: Hemel, Daniel Jacob and Ouellette, Lisa Larrimore, ‘Innovation Policy Pluralism’ (February 18, 2018). Yale Law Journal, Vol. 128, p. 544 (2019); Stanford Public Law Working Paper; Stanford Law and Economics Olin Working Paper No. 516; U of Chicago, Public Law Working Paper No. 664; University of Chicago Coase-Sandor Institute for Law & Economics Research Paper No. 849. Available at SSRN: See also: Mauritz Kop, ‘Beyond AI & Intellectual Property: Regulating Disruptive Innovation in Europe and the United States – A Comparative Analysis’ (December 5 2019)

[87] Kop (supra note 40)

[88] Combination is the key. Examples of potential unethical use of AI are facial recognition and predictive policing.

[89] See notes 30 and 31.

[90] High-Level Expert Group on Artificial Intelligence, ‘Ethics Guidelines for Trustworthy AI’ (European Commission, 8 April 2019). See See also Paul Opitz, ‘European Commission Working on Ethical Standards for Artificial Intelligence (AI)’, TTLF Newsletter on Transatlantic Antitrust and IPR Developments Stanford-Vienna Transatlantic Technology Law Forum, Stanford University, 2018 Volume 3-4,

U.S. Copyright Office Publishes Report on Moral Rights

By Marie-Andrée Weiss

The U.S. Copyright Office published on 23 April 2019 a report on moral rights entitled Authors, Attribution, and Integrity: Examining Moral Rights in the United States. Karyn Temple, the Office Director, wrote in her introduction that the report focuses “on the personal rights of individual authors and artists, who have often been excluded in broader conversations about copyright legal reforms.”

This concern echoes the philosophy behind the laws of European countries known as “author’s rights” (droit d’auteur), which protect a work as the imprint of the author’s personality. As the European Court of Justice explained in Infopaq, for instance, a literary work is composed of words “which, considered in isolation, are not as such an intellectual creation of the author who employs them. It is only through the choice, sequence and combination of those words that the author may express his creativity in an original manner and achieve a result which is an intellectual creation.”

Since a work expresses the personality of the author, he or she must then be provided moral rights to protect the integrity of the work, as well as his or her right to be presented as the author.  Moral rights are often presented as the main difference between copyright and author’s rights.

Are there moral rights in the U.S.?

Moral rights can be provided by contracts and licenses, which “have been at the forefront of protecting moral rights in the United States for many years and are commonly used in creative industries for that purpose” (p. 39 and p. 127). But what about the law?

The U.S. acceded to the Berne Convention for the Protection of Literary and Artistic Works only in 1988. Article 6bis of the Convention provides for moral rights, independently of the author’s economic rights: the author has “the right to claim authorship of the work and to object to any distortion, mutilation or other modification of, or other derogatory action in relation to, the said work, which would be prejudicial to his honor or reputation.”

As this right seems to provide authors a way to prevent fair use of their work, including the creation of derivative works, which are protected by the First Amendment, it is not surprising that the U.S. has not embraced the doctrine of moral rights. The report deals with the tensions between the First Amendment and moral rights (p.28), and calls the fair use doctrine “a vital First Amendment safeguard” (p. 30).

Indeed, no seminal moral right law was enacted after the U.S. joined the Berne Convention, as Congress determined, maybe a little bit hastily, that the United States already provided sufficient protection for the rights of attribution and integrity “through an existing patchwork of laws,” including the Lanham Act and some provisions of the Copyright Act (p. 7, p. 24 and p. 36).

However, in 1990 Congress enacted the Visual Artists Rights Act (“VARA”), section 106A of the Copyright Act, which provides authors of narrowly defined “work[s] of visual art” the right “to claim or disclaim authorship in the work, as well as a limited right to prevent distortion, mutilation, or modification of a work that is of recognized stature.”

In 1996 and 1997, the U.S. ratified the WIPO Performances and Phonograms Treaty, article 5 of which provides a moral right to performers who interpret works of art. Congress also considered in this instance that this right was already protected in the U.S. by the existing patchwork of laws, and that there was thus no need to enact a specific law (p. 26).

Congress however did enact in 1998 the Digital Millennium Copyright Act, which added section 1202 to the Copyright Act. Section 1202 prohibits, in some instances, removing, altering, or providing false copyright management information (“CMI”).

Article 5 of the 2012 WIPO Beijing Treaty on Audiovisual Performances gives performers a moral right in their live performances. While the U.S. signed the Treaty in 2012, it has not ratified it yet, and neither have any of the G6’s members.

Finally, some states have their own moral right statutes, for instance, the California Art Preservation Act of 1979 (p.120).


No need for a blanket moral right statute 

In its just-released report, the Copyright Office found no need to introduce a blanket moral rights statute at this time (p.9). Instead, it suggested amending the Lanham Act and the Copyright Act, as it “believes that updates to individual pieces of the patchwork may be advisable to account for the evolution of technology and the corresponding changes within certain business practices” (p. 39).

The Office also suggested that Congress could amend VARA. The federal statute only applies to “works of visual art,” which are narrowly defined by Section 101 of the Copyright Act as works existing in a single copy or a limited edition. The report noted several cases in which VARA protection was denied because “the work was considered promotional or advertising material” (p.66). The Office recommended, however, that only commercial art created pursuant to a contract and intended for commercial use be excluded from VARA’s scope (p.68).

The Office also suggested that Congress consider narrowly amending section 43(a) of the Lanham Act so that its unfair competition protections would include false representations of the authorship of expressive works. Section 43(a) applies to “false designation[s] of origin, false or misleading description[s] of fact, or false or misleading representation[s] of fact.” The Supreme Court had put a stop in 2003 to the use of Section 43(a) as a substitute for moral rights, finding in Dastar Corp. v. Twentieth Century Fox Film Corp. that the section should not be recognized as a “cause of action for misrepresentation of authorship of noncopyrighted works.”

This Supreme Court decision “resulted in the fraying of one square of the moral rights patchwork as originally envisioned by Congress” (p. 54). At stake in this case was the right of attribution of a work in the public domain, which had been commercialized by a third party without indicating the original author. Right of attribution is one of the standard moral rights.

The Office also suggested that Congress add a new cause of action in a new section 1202A of Title 17, so that the author of a work could recover civil damages if he or she can prove that the defendant knowingly removed or altered CMI with the intent to conceal the author’s attribution information.  Indeed, it “is common practice in the digital world for CMI to be stripped from works, disconnecting a work from its authorship and ownership information” (p.86).


Moral Rights and Right of Publicity

The Office recommended that Congress adopt a federal right of publicity law in order to reduce the uncertainty and ambiguity created by the diversity of state right of publicity laws. Almost all of the U.S. states have a right of publicity, whether via common law, statutory law, or both, but they differ in the length and scope of protection. “As a result, there is significant variability among the protections available to an author depending upon where he or she chooses to live, and the specter of federal copyright preemption looms over many right of publicity claims” (p.117).

The report noted that “the right of publicity had provided authors with causes of action for misattribution of authorship, material alterations to the author’s work, and distribution of the author’s work in connection with inferior packaging and artwork” (p.111).  However, as this right protects the name and likeness of the author or performer, it “cannot address situations where the author’s name or likeness is absent. Thus, the right of publicity can stand as a proxy for the right of attribution against violations resulting from misattribution, but has little to say in cases where the author is not credited at all” (p.113).  It cannot protect the integrity of the work either.

The report briefly noted that “the increasingly accessible video editing technology behind ‘deepfake’ software can not only fundamentally alter the content of an author’s work, but can also lead to social and moral harm for the artists and the subject of the video through malicious use” (p.8). This new technology is likely to trigger new right of publicity laws. For example, New York tried unsuccessfully to enact a new right of publicity statute that specifically addressed the issue of deepfakes.

It remains to be seen if Congress will heed the report’s suggestions. Whether it does or not, the debate on moral rights is likely to continue.

The Controversial European Union Directive on Digital Copyright

By Marie-Andrée Weiss

The EU Directive 2019/790 of the European Parliament and of the Council on Copyright and Related Rights in the Digital Single Market was adopted on 17 April 2019 and was published on 17 May 2019. It concludes a long and hard-fought lobbying campaign in which authors, internet companies, and the general public fiercely debated the most controversial issues of the Directive: the new related rights of press publishers (Article 15) and the new responsibility regime for online platforms (Article 17).

The Directive also addressed how works in the public domain or out-of-commerce could be used by “cultural heritage institutions,” that is, a library or a museum, and how research organizations could reproduce protected works for scientific research.


Facilitating use of content in the public domain: Article 14

Not all of the provisions of the Directive are controversial. For instance, Article 14 provides that reproductions of works in the public domain cannot be protected by copyright, unless this reproduction is original enough to be itself protected by copyright.

This means that museums and other institutions will no longer be able to claim a copyright on reproductions of works in the public domain which are in their collections. It remains to be seen if some of them will claim that the reproductions are original enough to be protected. Museums may change the way they photograph their works, although it would be difficult to claim that a mere reproduction of a painting is original enough to be protected. It could, however, be possible to make that claim for the reproduction of a sculpture, a building, or a garment (clothes can be protected by copyright in the EU).

Cultural heritage institutions are, however, granted by Article 6 the right “to make copies of any works or other subject matter that are permanently in their collections, in any format or medium, for purposes of preservation…or other subject matter.” They are thus given the fair use right to entirely reproduce a work, for preservation purposes only, and even for profit. The museum stores will be well stocked.


“Out-of-commerce works” Article 8

Collective management organizations which are “sufficiently representative of [relevant] rightholders” will have the right to conclude with cultural heritage institutions a non-exclusive non-commercial license for the use of “out-of-commerce works.” This will, for instance, allow books which are no longer published to be copied and distributed by libraries, and orphan works to be featured in museums. Authors will, however, have the right at any time to exclude their works from this scheme.


Data mining

Articles 3 and 4 provide for text and data mining exceptions. Article 3 provides a copyright exception “for reproductions and extractions made by research organizations and cultural heritage institutions in order to carry out, for the purposes of scientific research, text and data mining of works or other subject matter to which they have lawful access.”

The organizations will have to implement “an appropriate level of security” when storing the works. The rightholder will be able to expressly reserve their rights “in an appropriate manner, such as machine-readable means,” if the work is made available online. It is thus not an opt-in scheme, but an opt-out one, and an author failing to constrain such use by digital marking, or any other method, may not have much recourse.


Digital teaching

Article 5 of the Directive provides for a copyright exception for works used for teaching, when provided by an educational establishment, either on-site or online, through “a secure electronic environment accessible only by the educational establishment’s pupils or students and teaching staff.” This definition encompasses MOOCs, but not blogs, even if the sole purpose of the blogger is to provide information about a particular topic.

The two most controversial articles in the Directive are Article 15, which provides a related right to press publishers, and Article 17, which makes platforms liable for content protected by copyright which is illegally shared online.


Article 15 (formerly Article 11):  a related right for press publishers

Article 15 provides press publishers established in the EU the exclusive right, for two years, to reproduce the works they publish and to make them available to the public, a right which has been named by some of its detractors “ancillary copyright.” Authors retain, however, the right to independently exploit their works.

Recital 54 of the Directive explains that the wide availability of online news is a key element of the business models of news aggregators and media monitoring services, and a major source of profit for them. However, this makes licensing their publications more difficult for publishers, and thus it is “more difficult for them to recoup their investments.”

Not surprisingly, this proposal was fiercely debated, by news aggregators, of course, but also by non-profit organizations that viewed this new right as a threat to free exchange of information on the Web. The rights provided by Article 15 do not apply, however, “to private or non-commercial uses of press publications by individual users.”

Article 15 does not apply to either “very short extracts of a press publication” or to “individual words,” an exception which can hardly be described as a fair use exception. It is nice to know, though, that one has the right to reproduce a single word without having to pay a fee.


Article 17 (formerly article 13):  Towards an EU “DMCA”?

“Online content-sharing service providers” are defined by article 2(6) of the Directive as “provider[s] of an information society service of which the main or one of the main purposes is to store and give the public access to a large amount of copyright-protected works or other protected subject matter uploaded by its users, which it organizes and promotes for profit-making purposes.”

This long definition refers to digital platforms, such as Google or Facebook. They will have to obtain the authorization of the rightholder, for instance, through a license, in order to have the right to share the protected work with the public.

If they do not have this authorization, that is, in almost all cases, they will be liable for unauthorized acts of communication to the public of works protected by copyright, unless they “acted expeditiously, upon receiving a sufficiently substantiated notice from the rightholders, to disable access to, or to remove from their websites, the… works…and made best efforts to prevent their future upload” (Article 17.4(c)).

The platforms will have to put in place “an effective and expeditious complaint and redress mechanism…available to users of their services in the event of disputes.” This requirement is similar to the one put in place in 1998 by the Digital Millennium Copyright Act (DMCA), which provided a safe harbor for online service providers if they “expeditiously” remove or disable access to the infringing material after receiving a DMCA takedown notice.

Several legal scholars, such as Professor Wendy Seltzer and Professor Daphne Keller, have argued that the DMCA is a threat to free speech. Indeed, platforms regularly delete, automatically and zealously, works which are protected by the fair use doctrine upon receiving a DMCA notice. It is likely that the EU scheme will lead to similar overreach.

The Directive is ambiguous as to the way platforms are required to fulfill their new duties. Article 17.8 expressly provides that “application of [Article 17] shall not lead to any general monitoring obligation,” but Article 17.4(b) provides that the platforms must be able to demonstrate that they “made, in accordance with high industry standards of professional diligence, best efforts to ensure the unavailability of [protected] works.” Platforms may be inclined to consider that monitoring content by algorithms is precisely what the current “high industry standards of professional diligence” require.


Next stop: implementation, on a bumpy road

Member States have up to 7 June 2021 to transpose the Directive into their legal systems, since Directives, unlike Regulations, are not directly applicable in the EU.

However, the road to implementation is likely to be a bumpy one. In May, Poland filed a complaint with the Court of Justice of the European Union against the EU Parliament and the EU Council, claiming that Article 17 of the Directive would lead to online censorship. The debate over the Directive is likely to continue.

A Study in Trademarked Characters

By Marie-Andrée Weiss

The characters created by Disney, Marvel, and LucasFilm are valuable intellectual property and are protected both by copyright and by trademark. However, a recently decided case in the Southern District of New York (SDNY), Disney Inc. v. Sarelli, 322 F.Supp.3d 413 (2018), demonstrates that preventing the unauthorized use of such characters may not be as easy as expected.

In this case, Plaintiffs are Disney Enterprises, Marvel Characters and LucasFilm, all of which own copyrights and trademarks in many of the most famous characters in the world, such as Mickey Mouse, Hulk, and Chewbacca. These characters were first featured in movies like Frozen, The Avengers or Star Wars, and are now licensed or featured in derivative products such as comic books, video games, or theme parks. Their exploitation is highly lucrative.

When visiting Plaintiffs’ theme parks, one has a chance to meet the characters “in person.” This experience is also offered by Characters for Hire, a New York company offering, as the name implies, character hiring services. The company’s principal owner is Nick Sarelli (Defendant). Characters for Hire offers a service wherein actors dressed in costumes entertain guests during birthday parties or corporate events. For example, actors have allegedly dressed as Mickey, Elsa and Anna from Frozen, Captain America and Hulk from The Avengers, and Luke Skywalker and Darth Vader from Star Wars.

The contracts Defendants provided to their clients contained disclaimer language, stating, for example, that Defendants do not use trademarked and licensed characters. The contracts also warned clients that the costumes may differ from those seen in movies “for copyright reasons,” adding that “[a]ny resemblance to nationally known copyright character is strictly coincidental.”

These disclaimers did not appease Plaintiffs, who sent several cease and desist letters to Defendants before filing a federal copyright and trademark infringement suit and a New York trademark dilution suit.

While Judge Daniels from the SDNY granted Defendants’ motion for summary judgment and dismissed Plaintiffs’ claim for trademark infringement on August 9, 2018, he denied the motion to dismiss the copyright infringement claim and the trademark dilution claim.


The descriptive fair use defense failed

Plaintiffs claimed that the use of their trademarked characters to advertise and promote Defendants’ business, along with their portrayal by costumed actors, was likely to confuse consumers as to the origin of the services.

Defendants argued that their use of Plaintiffs’ characters was descriptive and nominative fair use, and that there was no likelihood of confusion.

Descriptive fair use is an affirmative defense to a trademark infringement suit, as Section 33(b)(4) of the Trademark Act allows “use… of a term or device which is descriptive of and used fairly and in good faith [but] only to describe the goods or services of such party, or their geographic origin.” In other words, a defendant can use plaintiffs’ trademarks in a descriptive sense, or to describe an aspect of his own good or service.

For such a defense to succeed in the Second Circuit, a defendant must prove that the use was made (1) other than as a mark, (2) in a descriptive sense, and (3) in good faith (Kelly-Brown v. Winfrey at 308). This defense failed in the present case, as Defendants had not made a descriptive use of Plaintiffs’ marks. Instead, Judge Daniels found that their ads “were specifically designed to evoke [Plaintiff’s marks] in consumers’ minds…”


The nominative fair use defense also failed

Defendants also claimed that they used Plaintiffs’ marks to describe their own products. Such nominative fair use is a defense to a trademark infringement suit if such use “does not imply a false affiliation or endorsement by the plaintiff of the defendant” (Tiffany v. eBay at 102-103). But this nominative fair use defense also failed, as Defendants used Plaintiffs’ marks to identify their own service, which is hiring out characters for parties, rather than Plaintiffs’ trademarked characters.


Defendants’ use of characters was not trademark infringement

Judge Daniels applied the eight-factor Polaroid test used by the Second Circuit in trademark infringement cases to determine whether Defendants’ use of Plaintiffs’ marks was likely to confuse consumers.

While Plaintiffs’ marks are undoubtedly strong (first factor), the similarity of the marks (second factor) weighed only slightly in Plaintiffs’ favor because Defendants used different names for their characters than Plaintiffs’ trademarked character names, e.g., “Big Green Guy,” “Indian Princess,” and “The Dark Lord” instead of Hulk, Pocahontas and Darth Vader.

The third and fourth Polaroid factors, the proximity of the goods and services, and the possibility that the senior user will enter the market of the junior user, were found to weigh in Defendants’ favor. There was no evidence that Plaintiffs had plans to expand into the private entertainment service industry.

The fifth Polaroid factor, evidence of actual confusion, also weighed in Defendants’ favor, as there was no evidence that Defendants’ customers used the names of Plaintiffs’ trademarked characters when referring to Defendants’ services in online reviews or otherwise. Plaintiffs could not provide a survey proving customers’ confusion either.

Judge Daniels found the sixth factor, Defendants’ intent and evidence of bad faith, to also be in Defendants’ favor, since Defendants had put customers on notice that their services were not sponsored by or affiliated with Plaintiffs by using altered versions of Plaintiffs’ characters’ names and by removing Plaintiffs’ characters’ names in their online reviews.

The seventh Polaroid factor, the quality of Defendants’ products, was also in Defendants’ favor, as the fact that Defendants’ services are of a lesser quality than Plaintiffs’ makes it likely that consumers will not be confused as to the source of the services.

The eighth Polaroid factor, consumer sophistication, also was in favor of Defendants, as Plaintiffs did not prove the sophistication level of Defendants’ relevant consumers.

Balancing these eight factors, the SDNY found no likelihood of consumer confusion and denied Plaintiffs’ motion for summary judgment on their trademark infringement claim.


Trademark dilution

Plaintiffs chose to claim trademark dilution under New York trademark dilution law, Section 360-l of the New York General Business Law, and not under the Federal Trademark Dilution Act. This choice may have been made because the New York law does not require a mark to be famous to be protected, and a plaintiff only needs to prove the mark’s distinctiveness or secondary meaning.

Judge Daniels found that there was a genuine issue of fact as to whether Defendants’ use of Plaintiffs’ marks is likely to dilute Plaintiffs’ marks by tarnishment. A court will have to determine if Defendants provide services of poor quality.


Copyright infringement

Plaintiffs argued that Defendants had “copied and used the images, likenesses, personas, and names of Plaintiffs’ characters…to promote and advertise its services online.” Defendants argued in response that the characters in which Plaintiffs own copyrights are based on prior works that are part of the public domain.

Both parties will have more chances to pursue their arguments as Judge Daniels denied the motion for summary judgment on copyright infringement. He found that Plaintiffs had presented as evidence screenshots from Defendants’ website and videos allegedly published by Defendants which had not been properly authenticated. More specifically, they had not been authenticated by someone “with personal knowledge of reliability of the archive service from which the screenshots were retrieved,” citing Specht v. Google, a 2014 Seventh Circuit case.

It is likely that the parties will settle out of court.

Update on EU Copyright Reform – Audiovisual Media Services and Copyright in the Digital Single Market

By Martin Miernicki

Two important legislative projects in the field of European copyright law have recently undergone substantial developments. The revision of the Audiovisual Media Services Directive (AVMSD) is likely to be adopted later this year, whereas the proposed Directive on Copyright in the Digital Single Market (CDSMD) will be subject to an extended debate in the European Parliament.



In 2015, the Commission adopted its Digital Single Market Strategy for Europe, calling for a better harmonization of the copyright laws of the EU member states as well as for enhanced access to online goods and services. Against this background, the proposal for a revision of the AVMSD as well as the proposal for the new CDSMD were made in 2016. The amendment to the AVMSD pursues a variety of policy goals, aiming at creating a clear level playing field for the provision of audiovisual content in the EU, especially addressing online services. The CDSMD contains provisions on the (collective) licensing of certain works and exceptions and limitations to the rights of right holders provided for in other European directives on copyright, among these Directive 2001/29/EC. Art 11 and 13 in Title IV of the proposed directive appear to be most controversial, providing for a right of publishers of press publications in the digital use of their press publications (so-called “link tax”) and an increased obligation of certain platform operators to monitor the content uploaded to their sites (so-called “upload filter”). Both reform projects have been intensely discussed and have yet to be adopted.


Current state of the reform projects

On 26 April 2018, the Commission announced a breakthrough in the negotiations with the Council and the European Parliament about the revision of the AVMSD, stating that a “preliminary political agreement” had been reached. This agreement was subsequently confirmed in June 2018. As regards the CDSMD, the Council’s permanent representatives committee agreed to its position on the draft directive on 25 May 2018 for the negotiations with the European Parliament. On 20 June 2018, the Parliament’s Legal Affairs Committee voted to start the negotiations, upholding controversial parts of the proposed CDSMD. However, the European Parliament rejected this decision in early July.


What can be expected?

Against the background of these developments, the revised AVMSD is likely to be adopted in autumn or winter, starting the period for the transposition of its rules into national law. It should be noted that Brexit also affects audiovisual media services, and it caused the Commission to publish a notice to stakeholders on this matter earlier this year. The future development of the CDSMD is less clear, as the proposal might be amended as a result of the upcoming debate. At this point, it is expected that the European Parliament will discuss the issue in September.

Full-work Licensing Requirement 100 Percent Rejected: Second Circuit Rules in Favor of Fractional Licensing

By Martin Miernicki

On 19 December 2017, the Second Circuit handed down a summary order on the BMI Consent Decree in the dispute between the Department of Justice (DOJ) and Broadcast Music, Inc. (BMI). The court ruled that the decree does not oblige BMI to license the works in its repertoire on a “full-work” basis.



ASCAP and BMI are the two largest U.S. collective management organizations (CMOs) which license performance rights in musical works. Both organizations are subject to so-called consent decrees which entered into force in 2001 and 1994, respectively. In 2014, the DOJ’s Antitrust Division announced a review of the consent decrees to evaluate if these needed to be updated. The DOJ concluded the review in August 2016, issuing a closing statement. The DOJ declared that it did not intend to re-negotiate and to amend the decrees, but rather stated that it interpreted these decrees as requiring ASCAP and BMI to license their works on a “full-work” or “100 percent” basis. Under this rule, the CMOs may only offer licenses that cover all performance rights in a composition; thus, co-owned works in which they only represent a “fractional” interest cannot be licensed. In reaction to this decision, BMI asked the “rate court” to give its opinion on this matter. In September 2016, Judge Stanton ruled against the full-work licensing requirement, stating that the decree “neither bars fractional licensing nor requires full-work licensing.”


Decision of the court

On appeal, the Second Circuit affirmed Judge Stanton’s ruling and held that fractional licensing is compatible with the BMI Consent Decree. First, referencing the U.S. Copyright Act, 17 U.S.C. § 201(d), the court highlighted that the right of public performance can be subdivided and owned separately. Second, as fractional licensing was common practice at the time the decree was amended in 1994, its language does not indicate a prohibition of this practice. Third, the court rejected the DOJ’s reference to Pandora Media, Inc. v. ASCAP, 785 F.3d 73 (2d Cir. 2015) because this judgment dealt with the “partial” withdrawal of rights from the CMO’s repertoire and not with the licensing policies in respect of users. Finally, the Second Circuit considered it to be irrelevant that full-work licensing could potentially advance the procompetitive objectives of the BMI Consent Decree; rather, the DOJ has the option to amend the decree or sue BMI in a separate proceeding based on the Sherman Act.


Implications of the judgement

The ruling of the Second Circuit is undoubtedly a victory for BMI, but also for ASCAP, as it must be assumed that ASCAP’s decree, which is very similar to BMI’s decree, can be interpreted in a similar fashion. Unsurprisingly, both CMOs welcomed the decision. The DOJ’s reaction remains to be seen, however. From the current perspective, an amendment of the decrees appears to be more likely than a lengthy antitrust proceeding under the Sherman Act; the DOJ had already partly toned down its strict reading of the decree in the course of the proceeding before the Second Circuit. Yet, legislative efforts might produce results and influence further developments before a final decision is made. A recent example of the efforts to update the legal framework for music licensing is the “Music Modernization Act” which aims at amending §§ 114 and 115 of the U.S. Copyright Act.

[1] For more information on the background see Transatlantic Antitrust and IPR Developments Issue No. 3-4/2016 and Issue No. 5/2016.


Is Embedding a Tweet on a Web Site Copyright Infringement?

By Marie-Andrée Weiss

A 5-page copyright infringement complaint filed last April in the Southern District of New York (SDNY) is being closely watched by copyright practitioners, as it may lead the court to rule on whether a Twitter post incorporating a copyrighted photograph, without permission of the author, is copyright infringement. The case is Goldman v. Breitbart News Network LLC et al., 1:17-cv-03144.

In the summer of 2016, Justin Goldman took a picture of the New England Patriots quarterback, Tom Brady, walking in the streets in the Hamptons, in New York, with members of the Boston Celtics basketball team. The picture was of interest as it could be inferred from it that Tom Brady was helping the Celtics to acquire star player Kevin Durant.

The picture was published by several Twitter users on the microblogging site, and these tweets were then embedded in the body of articles about Tom Brady’s trip to the Hamptons published by Defendants including Yahoo!, Time, the New England Sports Network, Breitbart and others.

Justin Goldman registered his work with the Copyright Office and filed a copyright infringement suit against the platforms which had reproduced his photograph. Defendants moved to dismiss, claiming that the use was not infringing because it was merely embedding, and also because it was fair use. Judge Katherine B. Forrest denied the motion to dismiss on August 17, 2017, because whether embedding a tweet is equivalent to in-line linking could not be determined at this stage of the procedure.

Defendants, minus Breitbart, then filed a motion for partial summary judgment on 5 October 2017. Plaintiff moved to oppose it on 6 November 2017.


The Exclusive Right to Display a Work

Section 106(5) of the Copyright Act gives the copyright owner the exclusive right “to display the copyrighted work publicly.” Section 101 of the Copyright Act defines displaying a work as “to show a copy of it, either directly or by means of a film, slide, television image, or any other device or process or, in the case of a motion picture or other audiovisual work, to show individual images nonsequentially.” Plaintiff argues that “embedding” is one of the processes covered by this definition.


Is Embedding a Tweet Just Like In-Line Linking?

Defendants claimed that incorporating an image in a tweet is no different from ‘in-line linking,’ which the Ninth Circuit found to be non-infringing in Perfect 10, Inc. v. Amazon.com, Inc. In that case, the issue was whether the thumbnail versions of copyrighted images featured by Google on its image search result pages were infringing.

The Ninth Circuit had defined “in-line linking” in Perfect 10 as the “process by which the webpage directs a user’s browser to incorporate content from different computers into a single window”. In that case, Google had provided HTML instructions directing a user’s browser to access a third-party website, but did not store the images on its servers. This was found not to be infringing: since Google did not have a copy of the protected photographs, it did not display them, because to “display” a work under Section 101 of the Copyright Act requires showing a copy of it. This reasoning is known as the “Server Test”.

Plaintiff distinguished the facts in our case from Perfect 10, claiming that his photograph was shown in full size, that it was not “framed” and that it was featured prominently on Defendants’ websites. He argued that the thumbnails in Perfect 10 were low-resolution pictures which users had to click in order to access the full photos, whereas an embedded tweet allows the user to see the full high-resolution image without further maneuvers.

Defendants argued instead that, similarly to the Perfect 10 facts, tweets were embedded using code which directed users’ browsers to retrieve the Tom Brady picture from Twitter’s servers, and the picture was indeed framed, with a light gray box. They had, as publishers, merely provided an in-line link to the picture already published by the Twitter users, and this was not direct copyright infringement. They argued that the embedded tweets were not stored on, hosted by or transmitted from servers owned or controlled by them.


Meanwhile, in the European Union…

Defendants argued that an embedded tweet functions as a hyperlink, since clicking on it brings the user to the Twitter site. This case is somewhat similar to the European Court of Justice (ECJ) GS Media and Svensson cases. In Svensson, the ECJ had found that posting a hyperlink to protected works which had been made freely available to the public is not a communication to the public within the meaning of article 3(1) of the InfoSoc Directive, which gives authors the exclusive right of public communication of their works. Recital 23 of the Directive specifies that this right covers “any… transmission or retransmission of a work to the public by wire or wireless means, including broadcasting.” The ECJ reasoned that providing a hyperlink is not a communication to a new public and is thus not infringing.

In GS Media, the ECJ found that posting hyperlinks to protected works, which had been made available to the public, but without the consent of the right holder, is not a communication to the public within the meaning of article 3(1) of the InfoSoc Directive either. However, if the links were posted by a person who knew or could have reasonably known that the works had been illegally published online, or if they were posted for profit, then posting these hyperlinks is a new communication to the public and thus infringing.

Could ECJ case law on hyperlinks inspire U.S. courts to revisit Perfect 10?