Generative Artificial Intelligence (AI) models are systems pre-trained using machine learning on large datasets. They break down the content of training data into small pieces of information, reassemble them, and generate new data based on these patterns.
One of the key questions regarding AI and copyright is whether training AI models with copyrighted data without the consent of the rights holders is permissible.
There is broad agreement that the creation of datasets for training purposes constitutes a copyright-relevant act of reproduction, which falls under the author’s exclusive exploitation rights (§§ 15 et seq. German Copyright Act). Since AI training is typically conducted without the rights holders’ consent, its legality depends on whether a statutory exception to copyright protection (a so-called copyright limitation) applies in favor of AI operators. Unlike the United States, Germany does not have a general fair use doctrine. Moreover, copyright law – at least so far – does not provide for a specific “AI exception.” Therefore, the use of copyrighted works for AI training is only lawful if one of the copyright limitations set out in the German Copyright Act applies. In this context, the applicability of the text and data mining (TDM) exceptions is particularly debated. These exceptions allow the automated analysis of digitized works to identify patterns and correlations, either for scientific purposes (§ 60d German Copyright Act) or for other (including commercial) purposes (§ 44b German Copyright Act).
In September 2024, the Hamburg Regional Court became the first German and European court to address the copyright implications of using training data and text and data mining for AI training purposes (Case No. 310 O 227/23). The court ruled that the creation of AI training datasets does not infringe the copyrights of third parties, provided that it serves scientific research purposes within the meaning of § 60d German Copyright Act. The case involved a lawsuit filed by a photographer against a non-profit organization that had made datasets publicly available for AI research. One of the datasets contained an image created by the plaintiff.
In its ruling, the Hamburg Regional Court confirmed the general applicability of the TDM exceptions to the creation of datasets for AI training purposes, not only for scientific research but also for commercial training. The court based its reasoning, among other things, on the AI Regulation (Regulation (EU) 2024/1689), which came into force in August 2024. Article 53(1)(c) of this regulation obliges providers of general-purpose AI models to respect the usage restrictions that copyright holders declare under the TDM exception, which the court took as confirmation that the creation of datasets for training artificial neural networks falls under the TDM exceptions.
The Hamburg Regional Court built on this requirement, emphasizing the importance of usage restrictions in protecting the rights of authors of publicly accessible digital works. Copyright holders can, and should, include an exclusion statement on their websites, for example in the imprint, in the terms of use, or in a machine-readable format, specifying which works and which types of use the restriction covers.
This ruling provides initial clarity and a degree of legal certainty in an emerging legal field. It thoroughly examines the technical aspects and legal implications of AI. While the court adopts a broad interpretation of the concept of text and data mining (TDM), it simultaneously stresses the need to adequately consider the interests of rights holders. A key aspect of the ruling is the strict distinction between scientific TDM, which is permitted regardless of usage restrictions, and commercial TDM, which is only allowed if no valid usage restriction is in place. This distinction is particularly relevant in the commercial sector, where the creation of training datasets requires significant resources and holds substantial innovation potential.
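The distinction between scientific and commercial TDM can be sketched as a simple decision rule. The following Python snippet is only an illustration of the two-branch logic described above; the names `Purpose`, `TdmUse`, and `tdm_exception_applies` are hypothetical, and the sketch deliberately ignores every other statutory condition of §§ 44b and 60d German Copyright Act as well as the question of when a usage restriction is actually valid.

```python
from dataclasses import dataclass
from enum import Enum


class Purpose(Enum):
    """Purpose of the text and data mining (simplified to two cases)."""
    SCIENTIFIC_RESEARCH = "scientific research (§ 60d)"
    OTHER = "other purposes, including commercial (§ 44b)"


@dataclass(frozen=True)
class TdmUse:
    purpose: Purpose
    # Assumption: whether a usage restriction is legally valid has
    # already been assessed elsewhere; this flag only carries the result.
    valid_usage_restriction: bool


def tdm_exception_applies(use: TdmUse) -> bool:
    """Illustrative decision rule, not a legal test.

    Scientific TDM: permitted regardless of usage restrictions.
    Other (including commercial) TDM: permitted only if no valid
    usage restriction is in place.
    """
    if use.purpose is Purpose.SCIENTIFIC_RESEARCH:
        return True
    return not use.valid_usage_restriction
```

For example, `tdm_exception_applies(TdmUse(Purpose.OTHER, valid_usage_restriction=True))` evaluates to `False`, whereas the same call with `Purpose.SCIENTIFIC_RESEARCH` evaluates to `True`.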
It remains to be seen how other courts will rule on the use of copyrighted works for training AI models. An appeal has already been filed against the ruling of the Hamburg Regional Court, meaning that the Hamburg Higher Regional Court will now have to address the legality of dataset creation for AI training purposes.
Additionally, the German music rights organization GEMA filed lawsuits before the Munich Regional Court against OpenAI in November 2024 and against Suno Inc. in January 2025. The lawsuits allege the unauthorized use of copyrighted song lyrics and musical compositions for AI training purposes. These proceedings will primarily focus on the remuneration of authors of digital works, as well as issues related to digital usage restrictions and licensing.