The Rights Alliance cannot support the proposed implementation of the EU AI Regulation

Apr 14, 2025 | Artificial Intelligence

The approach taken in the AI Office’s proposed code of practice for providers of general-purpose AI models is so inadequate that no code at all would be preferable.

Effective exercise and enforcement of copyright in the context of generative artificial intelligence requires sufficient transparency regarding the works and content used to train AI models. The EU AI Act includes provisions aimed at ensuring such effective rights enforcement. These provisions require providers of AI models to adopt a policy for complying with copyright and related rights, as well as to prepare and publish a sufficiently detailed summary of the training data used.

The EU AI Office established a working group tasked with drafting a voluntary code of practice that AI model providers could sign to demonstrate compliance with these obligations. Throughout the process, Rights Alliance, along with other stakeholders, has submitted input and feedback.

Unfortunately, the current and final draft of the code leaves rightsholders without any real possibility of exercising and enforcing their rights. This is despite the fact that we, along with numerous other rightsholders, have consistently raised concerns about the weak obligations proposed by the AI Office.

Rights Alliance therefore supports the joint statement sent today to the EU AI Office by a coalition of European rightsholders. The message is clear: no code is better than the one proposed in this third and final draft.

Read the statement here

Popular AI services are trained on illegal content

The Rights Alliance has consistently highlighted how AI providers train their models on copyrighted content without permission. This is documented in our latest report from March, which compiles our investigations into the most prominent AI models and reveals how all of them have collected and trained on illegal content sourced from well-known piracy platforms.

The widespread exploitation of unlawful copies for AI training purposes underscores the urgent need for effective tools that enable rightsholders to detect unauthorised use of their works and enforce their rights.

Table: Summary of the findings in the report on pirated content used for training generative AI.
The table shows which providers and AI models have been trained using specific datasets containing illegal content.

We have presented our findings to EU policymakers and included them in our contributions to the AI Office’s code of practice working group and transparency template. Unfortunately, this has not had the desired impact on the development of the code of practice.

Danish books used to train Meta’s AI

Meta is one of the providers that have trained their generative AI models using illegal content from piracy platforms. In our report, we document how Meta’s LLaMA model has been trained on five datasets containing content from such platforms. One of these datasets originates from the file-sharing service LibGen, which is dedicated to distributing illegal copies of books, including works written by Danish authors and published by Danish publishers.

The Atlantic has reported that Meta trained its generative AI model on “millions of pirated books” from LibGen. In connection with this, The Atlantic has published a searchable database of the dataset’s contents, allowing anyone to look up the authors and books that have been used to train Meta’s LLaMA model. Among them are many titles whose rights are owned by Danish authors and publishers.

See which books are available in LibGen here

Rights Alliance has previously contacted Meta, as part of a broader effort concerning the Books3 dataset, requesting that the company stop using illegal AI training data. However, Meta has chosen not to respond. Instead, our report, along with The Atlantic’s coverage, shows that the AI provider continues unabated to train its generative AI on illegal content.

Meta is being sued by authors in the United States over its use of Books3.

The French author and publisher organisations SNE, SNAC, and SGDL also filed a lawsuit against Meta in France in March, for the alleged use of their copyrighted content to train Meta’s AI models without permission.