For years, a war has been waged on the internet between the holders of intellectual property rights, be they record labels, film producers, publishers or news media, and the technology giants. The dispute has now reached a new level with the rise of generative artificial intelligence (generative AI): systems that, in order to work, must be trained on billions of pieces of web content (texts, images and sounds) so that they can make their statistical inferences and produce answers that seem prodigious. Are tech companies taking advantage of other people’s intellectual property? Would their systems be as capable if they could not train on protected works, as they are allowed to in many countries? Does training an artificial intelligence on copyrighted texts or images mean that its output is a derivative work? Is a generative artificial intelligence a cut-and-paste machine, or can it be like Picasso entering the Prado, contemplating thousands of works and then creating?

The answers are not always simple, but swords are already drawn. Authors and publishers, visual artists and news media demand that their authorization be requested before their content is used, and that generative AI products disclose that they are AI-generated and which datasets they were trained on. EVA, the association of European Visual Artists, has launched the AI Manifesto, in which it points out that there is no “artificial intelligence art without human artwork, and they cannot use it without permission”. The media are particularly concerned, not only because these new platforms can train their systems on news articles and use them to answer users, but also because they see a serious threat to democracy: the results these systems offer may draw on unreliable sources, yet the platforms are not held to any standard of veracity.

Photo giant Getty has sued Stability AI, the company behind the image generation platform Stable Diffusion, claiming it used 12 million of its images to train its AI model without permission or compensation. And comedian Sarah Silverman has sued OpenAI (ChatGPT) and Meta, Mark Zuckerberg’s company, for training their language models, she says, on her books.

Andrés Guadamuz, Professor of Intellectual Property Law at the University of Sussex, believes that the lawsuit against Meta is the one most likely to go somewhere. “OpenAI says that it used 85% content from the web (Wikipedia, Reddit, many databases) and 15% books to train the model. Part of that carries no rights and part is unclear. But Meta did use a copyright-protected database. We are talking about 169,000 books”, he points out.

He also notes that many countries have exceptions for these uses, created to advance AI research. The EU, since the 2019 directive on copyright in the digital single market, allows it, he points out, “for scientific institutions, but also for companies like Meta, as long as they respect the rights holders’ opt-out, that is, their right to request exclusion”. The same is true in Japan and the United Kingdom, “but in the US it is still not very clear”. The case is even more complicated, he says, because “the database that Meta used was not created by them, but by a non-profit organization, EleutherAI”. “I think Meta’s defense will be that, within the database used, each individual book has no independent commercial value. They could try to reach a collective arrangement with the publishing houses to pay for a license. Another issue raised by the lawsuit is that the model is a derivative of the books used in the training. I don’t think so, and I don’t think any judge will either. Out of 169,000 books, you take out Silverman’s and the model works the same. What matters is having thousands and thousands of works, billions of words, and looking for statistical patterns. I don’t think we could legally talk about a derivative work”.

In Europe, he says, lawsuits face a difficult path under current legislation. “A photographer has sued the non-profit German image database LAION, but I don’t see the case going anywhere.” He does acknowledge that “artists have a better economic case for going after the databases. There is no doubt that many illustrators will be out of work. They can ask for the existing exceptions to be changed. And the same goes for musicians. AI companies have held back from releasing music products for fear of the industry, but they already have very good models”, he points out.

Jorge Corrales, director general of Cedro, an entity that manages the reproduction rights of books, magazines and newspapers, is adamant that in this field “any use of protected content must be accompanied by a request for authorization from the authors or publishers. And we need to know what content is used in the products they are putting on the market”. He adds that the current European directive is already obsolete. “Brussels is working on a new framework. There is a draft text requiring machine-generated content to be identified as AI-generated, and requiring a summary of the content used to be provided”.

Irene Lanzaco, director general of the Information Media Association (AMI), calls for respect for “the right of news publishers to reap the rewards of the work they undertake, and for any use by generative artificial intelligence to be subject to the publisher’s authorization and the payment of appropriate financial compensation”. “And generative artificial intelligence must answer for the reliability of its information to the same extent as a press publisher. We cannot continue to pretend that technological platforms have no responsibility for the content they spread, because they are in a position of absolute primacy. Profit cannot exist without responsibility. And it would be good to call on Europe to protect the value of its cultural creation. The cultural production that this continent has offered to the world is part of the very essence of Europe. It is our main asset and we cannot trivialize it in this revolution”, she concludes.