Anthropic Settles for $1.5B Over Pirated Books Used in AI Training

Anthropic has reached a groundbreaking $1.5 billion settlement to resolve a lawsuit over its use of copyrighted books to train its AI chatbot, Claude. A legal filing recently made public lays out the terms of the agreement.

The case began when authors and publishers accused Anthropic of using more than 7 million copyrighted books without permission. A federal judge ruled in June that while Anthropic’s training methods fell under fair use, maintaining these digital works in a “central library” violated copyright law. The judge also suggested that company executives knew they were downloading pirated content, which led to a trial initially set for December.

The proposed settlement, submitted to a federal judge for approval, would pay $3,000 (about €2,800) per book to numerous authors, a monumental payout in U.S. copyright history. Previous cases, however, have produced higher per-work payments. For instance, in 2012, a Minnesota woman faced a penalty of approximately $9,000 (around €8,400) per illegally downloaded song, after an initial penalty exceeding $60,000 (around €56,000).
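As a rough back-of-the-envelope check, the figures reported above imply how many books the fund could cover. The Python sketch below assumes, purely for illustration, that the full $1.5 billion is divided evenly at $3,000 per book, with no deductions for fees or costs. The variable names are hypothetical, and the result is an implied figure, not one stated in the filing.

```python
# Figures as reported in this article (US dollars).
settlement_usd = 1_500_000_000   # total proposed settlement
per_book_usd = 3_000             # reported payment per book

# Assumption: the whole fund is split evenly at the per-book rate,
# ignoring legal fees and administrative costs.
implied_books = settlement_usd // per_book_usd

# "Over 7 million" books were alleged; 7 million is used as a floor.
alleged_books = 7_000_000
share_of_alleged = implied_books / alleged_books

print(f"Implied books covered: {implied_books:,}")
print(f"Share of the alleged 7 million: {share_of_alleged:.1%}")
```

Under these assumptions, the per-book rate would cover only a fraction of the books the plaintiffs alleged were used.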

In a statement to Gizmodo, Anthropic highlighted the earlier ruling that deemed its training process fair use. “In June, the District Court issued a landmark ruling on AI development and copyright law, finding that Anthropic’s approach to training AI models constitutes fair use,” said Aparna Sridhar, deputy general counsel at Anthropic.

Sridhar elaborated, “Today’s settlement, if approved, will resolve the plaintiffs’ remaining legacy claims. We remain committed to developing safe AI systems that help people and organizations extend their capabilities, advance scientific discovery, and solve complex problems.”

According to the legal filing, the settlement payments will be disbursed in four installments based on court-approved milestones. The first payment of $300 million (approximately €280 million) will occur within five days of the court’s preliminary approval of the settlement, followed by a second payment of another $300 million. Subsequently, $450 million (about €420 million) will be due, with interest, within a year after the preliminary order, and the remaining $450 million (approximately €420 million) within the following year.
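The arithmetic of the schedule can be checked with a short sketch. The Python below is only an illustration built from the dollar amounts reported above. The variable names are hypothetical, and the interest due on the later installments is left out because the article does not give a rate.

```python
# Installment amounts in millions of US dollars, in the order reported above.
# Interest on the later installments is omitted: no rate is stated in the article.
installments_musd = [300, 300, 450, 450]

total_musd = sum(installments_musd)

# Cumulative amount paid after each installment.
running_totals = []
paid_so_far = 0
for amount in installments_musd:
    paid_so_far += amount
    running_totals.append(paid_so_far)

print(f"Total: ${total_musd:,} million")
print(f"Cumulative after each payment: {running_totals}")
```

Excluding interest, the four installments add up to the $1.5 billion headline figure, with 40 percent of it covered by the first two payments.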

Anthropic, boasting a recent valuation of $183 billion, continues to face lawsuits from various companies, including Reddit. The social media platform recently struck a deal with Google to permit the training of AI models on its content. Additionally, authors are pursuing active cases against other major tech firms like OpenAI, Microsoft, and Meta.

The June ruling clarified that Anthropic’s use of copyrighted books for AI training could be considered fair use under U.S. copyright law, stating that the act of reading “all the modern-day classics” and emulating them would not violate copyright laws:

…not reproduced to the public a given work’s creative elements, nor even one author’s identifiable expressive style…Yes, Claude has outputted grammar, composition, and style that the underlying LLM distilled from thousands of works. But if someone were to read all the modern-day classics because of their exceptional expression, memorize them, and then emulate a blend of their best writing, would that violate the Copyright Act? Of course not.

“Like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them—but to turn a hard corner and create something different,” the court noted.

By this legal reasoning, Anthropic merely needed to purchase the books it had pirated to comply with copyright law, at a cost likely lower than $3,000 (approximately €2,800) per book. However, the New York Times reports that the settlement will not set any legal precedent for future cases, because the case is not proceeding to trial.

What does this settlement mean for the future of AI and copyright law? It leaves open how existing intellectual property rights apply to AI training. The June ruling treated training itself as fair use but not the storage of pirated copies, and the settlement ends the case before a trial could test those questions further.

How might this settlement influence other tech companies? Other firms, like OpenAI and Microsoft, are likely watching closely. Although the settlement sets no legal precedent, it could become a practical benchmark for how training data is acquired, with effects on various industries and creative sectors.

Are there other ongoing copyright issues related to AI? Yes, numerous lawsuits are still pending, including authors’ cases against OpenAI, Microsoft, and Meta, reflecting the growing tension between technological advancement and intellectual property protections.

What can authors do to protect their work in the age of AI? Authors should stay vigilant, engage in discussions about AI ethics, and advocate for stronger protections against unauthorized use of their creations.

For further coverage of AI and copyright, explore more at Moyens I/O.