Popular American comedian Sarah Silverman and authors Richard Kadrey and Christopher Golden have sued Meta and OpenAI for violating their copyrights by using the content of their books without permission to train artificial intelligence language models. The class action lawsuit was filed in federal court in San Francisco last Friday.
The creators of large language models are beginning to face lawsuits over the unauthorized use of the massive amounts of copyrighted material their applications need in order to deliver realistic responses to user requests.
The lawsuit against Meta claims that leaked information about the company’s artificial intelligence business shows that the plaintiffs’ works were used without permission.
The complaint against OpenAI, meanwhile, alleges that the summaries of the plaintiffs’ works generated by the ChatGPT chatbot indicate that the bot was trained on their copyrighted content.
While the summaries may contain some erroneous details, they demonstrate that ChatGPT “retains knowledge of particular works in the training dataset,” according to the lawsuit. The plaintiffs are now demanding monetary damages as the copyright owners of the works that have allegedly been infringed.
The lawsuit against Meta alleges that the authors’ books were accessible in the data sets that Meta used to train its LLaMA models.
The lawsuit alleges that this data has illicit origins. In the Meta paper detailing its LLaMA model, the company lists the sources of its training data sets, and one of them is The Pile. This source, according to the complaint, has been described as “a copy of the contents of the Bibliotik private tracker,” which is one of the well-known “shadow libraries” that the complaint calls “flagrantly illegal.”
Last week OpenAI was hit with two other lawsuits: one brought by two other authors, Paul Tremblay and Mona Awad, and the other a class action accusing ChatGPT and DALL-E of violating the privacy and copyrights of millions of internet users.