Mark Zuckerberg and Meta are facing serious legal and ethical accusations regarding their alleged use of pirated content to train their advanced AI model, Llama 3. New court filings suggest that Meta executives, including Zuckerberg, deliberately sought out high-quality content from notorious piracy sites such as LibGen and Anna’s Archive, which host millions of stolen books and research papers.
According to the court documents, Meta officials discussed the need for quality content in a highly critical email, noting that “Books are actually more important than web data.” To meet this demand, Meta is accused of turning to digital piracy hubs without compensating authors, publishers, or researchers. LibGen alone contains over 7.5 million pirated books and 81 million stolen research papers, and Anna’s Archive offers similar content.
This controversial move has angered authors and creators, who argue that their intellectual property was taken without consent or compensation to fuel Meta’s AI advancements. Despite Meta’s impressive financial figures, with revenues exceeding $164 billion and profits nearing $62 billion in 2024, the company allegedly chose to avoid compensating content creators while using their work for AI training.
Critics argue that Meta, with its vast resources, could have chosen a more ethical path by negotiating licensing agreements that respected intellectual property rights. Instead, they claim, the company took a shortcut to boost its AI capabilities in the short term rather than fostering long-term partnerships and collaboration.
Meta’s defense centers on the concept of “fair use,” claiming that its AI transforms the pirated content into something sufficiently new. However, legal experts warn that fair use typically applies to educational and critical uses, not to massive corporations profiting from unauthorized data scraping.
In addition, an investigation by the author of a Forbes article found that all five of their own published books had been pirated and included in Meta’s dataset, further highlighting the scale of the alleged infringement.
A class-action lawsuit has been filed, accusing Meta of copyright infringement and unfair competition. Experts suggest that this issue goes beyond Meta alone, pointing to the broader AI industry’s growing reliance on unlicensed content. There is an urgent need for ethical guidelines and clear regulations to ensure fair compensation for content creators and to protect intellectual property rights in the rapidly evolving AI landscape.