Judge Clarifies Fair Use in AI Training Amid Copyright Concerns

2025-06-25 general

San Francisco, Wednesday, 25 June 2025.
A U.S. judge has ruled that Anthropic’s use of copyrighted books to train its AI models is fair use, but that storing pirated copies of those books is not, a split decision that could shape liabilities across the tech and publishing sectors.

The Decision and Its Implications

On June 23, 2025, Judge William Alsup delivered a pivotal ruling on the legality of training AI models on copyrighted books. By deeming such training ‘fair use’ where it results in transformative applications, the judgment favors AI companies like Anthropic. However, the transformative nature of the training does not absolve Anthropic of liability for pirating copyrighted content to build its library of books. This distinction could set an influential precedent for intellectual property law in the tech and publishing industries [5][6].

Anthropic’s Legal Challenges

While Anthropic welcomed the fair use finding, it still faces substantial exposure over accusations that it pirated books. The lawsuit, filed in August 2024 by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, alleged that Anthropic’s AI models were trained on pirated copies obtained from shadow library sites like LibGen. Judge Alsup ordered a further trial to assess potential damages, which could reach extraordinary amounts if the infringement is deemed willful [1][4][7].

Fair Use Principles and Compliance

Judge Alsup’s decision emphasized that creating digital copies of legitimately acquired books is consistent with fair use principles, drawing parallels with cases like Google Books. The decisive difference lies in how the books were acquired. Anthropic’s storage of pirated copies, described in the court’s ruling as ‘not fair use,’ marks the line between lawful transformation and unlawful acquisition. Future compliance with copyright law will likely involve close scrutiny of how data used in AI training is acquired [6][8].

Future Directions in Copyright Law

This ruling sets the stage for ongoing legal scrutiny of AI training data. The court’s application of the four-factor fair use test could become a blueprint for judging similar cases involving published materials. The emphasis on transformative use is seen as a potential advantage for AI companies, yet the unlawful collection of training data continues to pose legal and ethical challenges. As legal frameworks evolve, the balance between innovation and respect for copyright remains contested [3][5].