Anthropic will pay $1.5 billion to settle a class action copyright lawsuit after the AI firm downloaded millions of pirated books and used them to train its Claude large language models (LLMs).
Its lawyers filed for preliminary approval of the sum on Friday with a San Francisco federal judge. Trial had been set for December. (The case is Bartz et al v. Anthropic PBC.)
Anthropic’s move comes after the judge overseeing the case, William Alsup, issued a split decision in June.
He determined that 1) Anthropic’s use of the books to train LLMs was “spectacularly” transformative and therefore a fair use; 2) Anthropic’s digitisation of books it purchased in print form was fair use because the digital copies were a replacement of print copies it bought; 3) Anthropic’s use of “pirated” copies of books in its central library was infringing.
His decision was one of the first substantive US court rulings on whether using copyrighted works to train LLMs for generative AI infringes copyright or is fair use.
"Anthropic’s next pirated acquisitions..."
Judge Alsup's summary judgement showed that in 2021, Anthropic cofounder Ben Mann downloaded Books3, “an online library of 196,640 books that he knew had been assembled from unauthorized copies of copyrighted books – that is, pirated.”
As Judge Alsup wrote: “Anthropic’s next pirated acquisitions involved downloading distributed, reshared copies of other pirate libraries. In June 2021, Mann downloaded in this way at least five million copies of books from Library Genesis, or LibGen, which he knew had been pirated.
"And, in July 2022, Anthropic likewise downloaded at least two million copies of books from the Pirate Library Mirror, or PiLiMi, which Anthropic knew had been pirated…”
Anthropic's decision to settle comes days after it raised $13 billion in a Series F round co-led by Fidelity and Lightspeed Venture Partners that valued it at a massive $183 billion.
(The company has grown its revenue run rate from $1 billion to $5 billion in just eight months this year, it said on September 2. Anthropic now serves over 300,000 business customers.)
Representing the plaintiffs, lawyer Justin Nelson of Susman Godfrey told The Stack in an emailed comment: “This landmark settlement far surpasses any other known copyright recovery. It is the first of its kind in the AI era. It will provide meaningful compensation for each class work and sets a precedent requiring AI companies to pay copyright owners.
“This settlement sends a powerful message to AI companies and creators alike that taking copyrighted works from these pirate websites is wrong.”
The class action saw the plaintiffs’ lawyers argue that Anthropic’s LLMs could displace demand for the authors’ books, and that the company’s unauthorised use of the works has the potential to displace an emerging market for licensing them to train LLMs.
Days after Judge Alsup’s decision this summer, another district judge in a separate case came to a different conclusion: Judge Vince Chhabria, in Kadrey v. Meta Platforms, dismissed similar copyright infringement claims. (Several well-known authors had sued Meta for downloading their work from “shadow libraries” and using it to train the Llama models.)
Chhabria concluded simply (as law firm Jackson Walker put it in a July 11 analysis) that the thirteen plaintiff authors “did not provide evidence that Meta’s use of their works resulted in any market harm, either by causing the AI to reproduce substantial portions of their books or by negatively affecting the market for licensing books as AI training data” – and that Meta’s LLMs could only output trivial snippets of the plaintiffs’ works.
Judge Chhabria said his decision “does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful.
“It stands only for the proposition that these plaintiffs made the wrong arguments and failed to develop a record in support of the right one.”
Anthropic commented today that “we remain committed to developing safe AI systems that help people and organizations extend their capabilities, advance scientific discovery, and solve complex problems.”