Five major publishers are suing Meta over Llama

Elsevier, Cengage, Hachette, Macmillan, and McGraw Hill, joined by author Scott Turow, filed a proposed class action in Manhattan on Tuesday alleging Meta pirated millions of their works to train Llama. After Judge Chhabria’s June 2025 ruling, plaintiffs with stronger market-harm evidence have been waiting their turn.

On Tuesday morning, five of the world’s largest publishers and one of America’s best-known novelists walked into a Manhattan federal courthouse and filed a proposed class action complaint against Meta Platforms.

According to Reuters, Elsevier, Cengage, Hachette, Macmillan, and McGraw Hill, alongside the author Scott Turow, allege that Meta pirated millions of their books and journal articles to train its Llama large language models without permission, payment, or licence. The complaint asks the court to certify the case as a class action representing all similarly situated rights holders.

It is, by date, the latest in a long line of AI-training copyright cases. By substance, it is meaningfully different from most of those that have come before.

Why this case is not Kadrey

Anyone who has been following AI-copyright litigation will recognise the name in the precedent slot: Kadrey v. Meta. That earlier case, filed in 2023 in the Northern District of California by authors including Sarah Silverman, Richard Kadrey, Christopher Golden, Ta-Nehisi Coates, Junot Díaz, and Michael Chabon, made effectively the same allegations: that Meta downloaded copyrighted books from pirate libraries (LibGen, Z-Library, and Anna’s Archive) and used them to train Llama.

Court records cited by Tom’s Hardware established that Meta employees torrented roughly 82 terabytes of pirated material in the process. Mark Zuckerberg personally signed off on the use of LibGen for Llama training, despite internal AI executives flagging it as a “data set we know to be pirated” that could “undermine [Meta’s] negotiating position with regulators.”

And Meta won that case. In June 2025, Judge Vince Chhabria granted summary judgment for Meta on fair-use grounds, finding that the use of copyrighted books to train Llama was sufficiently transformative to clear the fair-use threshold. But Chhabria’s ruling was unusually narrow, and unusually candid about its limits.

He said publicly that Meta’s win “may be in significant tension with reality” and that the ruling applied only to the specific authors who had brought the case. He noted explicitly that future plaintiffs could succeed if they presented stronger evidence of market harm, the prong of fair-use analysis on which the Kadrey plaintiffs had, in his view, fallen short.

Tuesday’s filing reads, on first inspection, as exactly the kind of case Chhabria invited.

What the publishers are bringing that the authors did not

There are three structural differences between Kadrey and the new lawsuit, and all three favour the plaintiffs. The first is the catalogue. Where Kadrey involved roughly 666 specific books from a small group of individual authors, the new complaint covers the entire publishing operations of five companies that together account for a substantial share of the world’s academic, educational, and trade publishing output.

Per Reuters’ description of the complaint, titles include not only literary works such as N.K. Jemisin’s “The Fifth Season” and Peter Brown’s “The Wild Robot” but also textbooks, scientific journal articles, and reference works. The market for those works, particularly the academic and educational categories, is structurally different from the trade-fiction market that dominated the Kadrey plaintiff set.

The second is the market-harm evidence. Academic and educational publishers can document, in ways individual authors typically cannot, the specific revenue lines that AI-trained models substitute for. When Llama answers a student’s biology question that would otherwise have required consulting a Cengage textbook, the substitution is direct and measurable.

The plaintiffs will, on the standard pleadings strategy for a case of this kind, present that substitution as the kind of identifiable market harm Chhabria’s June ruling specifically identified as missing from Kadrey. Reed Smith’s analysis of the recent fair-use decisions noted that the market-harm prong, more than transformativeness, is now the operative legal battleground.

The third is the licensing-market context. Since 2023, AI companies have signed an increasing number of licensing deals with publishers. Meta itself has signed deals with Reuters, CNN, Fox News, People Inc., and USA Today for content licensing.

The existence of those licences is, in fair-use law, a significant fact: courts examining the market-harm prong now have evidence that a licensing market exists, that some publishers have priced and negotiated participation in it, and that Meta has chosen to participate in some markets while bypassing others. The new plaintiffs will argue that bypassing them while licensing others is itself evidence of bad faith.

The Anthropic settlement, in the background

Tuesday’s case lands against another piece of recent precedent. Anthropic, in a settlement the Authors Guild publicly described as significant, agreed earlier this year to pay authors as part of resolving the Bartz v. Anthropic class action over similar allegations. The settlement amount and terms set a marker for what AI-training copyright cases can produce when they reach a financial resolution rather than an early summary judgment.

TNW has tracked Anthropic’s broader commercial trajectory through the parallel $1.5bn enterprise services joint venture, the IPO preparations, and the model-deployment programmes; the Bartz settlement is, in financial terms, a manageable line item against that backdrop. For Meta, with its different fact pattern and its prior summary-judgment win, the calculus is different.

Settlement is, however, only one possible outcome. The other is that Meta tries the case the way it tried Kadrey, betting that the fair-use defence will hold even against more substantive plaintiffs. The risk in that strategy is asymmetric. A second win for Meta would, in effect, settle the question for the entire market: Llama-style training on pirated corpora is fair use even when the plaintiffs are a major industry. A loss would cost the company more in damages and structural remedies than the first case avoided.

Meta’s wider legal-cost trajectory

The new case sits inside a broader legal landscape Meta has been navigating for some time. TNW reported last week on the Meta-New Mexico phase-two trial in Santa Fe, in which the state is seeking algorithm changes, age-verification mandates, and a $3.7bn teen mental-health fund tied to the company’s youth-safety record.

TNW’s analysis earlier this year noted that Meta’s mounting child-safety legal exposure could, eventually, cost more than its $145bn AI capex programme. Meta’s Q1 2026 capex guidance is now between $125bn and $145bn for the year, an order of magnitude that makes any single litigation outcome look small in absolute terms but that also raises the question of how many simultaneous fronts the company can accept legal exposure on without commercial consequences.

There is also the broader regulatory backdrop. TNW has covered Anthropic’s Mythos and the Eurogroup’s parallel concerns about AI capability and access; that is a different set of regulatory concerns from copyright, but it is part of the same wider story about how AI companies’ commercial speed is colliding with multiple categories of slower-moving legal infrastructure. The publishers’ Tuesday filing is the copyright instance of that collision.

What the case is really asking

The narrow legal question is whether the use of pirated copyrighted material to train Llama constitutes fair use under US copyright law. The wider question, the one the publishing industry is actually trying to settle, is whether the existing fair-use doctrine, written before generative models existed, can be stretched to accommodate them or whether some new framework, statutory or judicial, has to be built. The Kadrey ruling stretched the doctrine. The new case will test how far the stretch will go.

If the publishers win, even partially, the licensing market for AI training data becomes a structural fixture of the industry, with material commercial implications for every model company currently relying on broadly scraped corpora. If they lose, the practice of training on pirated material at scale becomes effectively legally durable in the United States, with the regulatory response shifting to legislatures rather than courts.

The procedural calendar will move slowly. Class certification, motions to dismiss, summary-judgment briefing, and trial scheduling will, in the ordinary course, take 18 to 24 months. Investing.com flagged the broader market-screener context around the lawsuit’s announcement, noting that several other AI-training copyright cases are now moving through US courts simultaneously, with some likely to reach the appellate level before this one is resolved. The Tuesday filing is, in that sense, a long bet rather than an immediate threat.

It is, however, the most credible long bet the publishing industry has yet placed against an AI-training defendant. After Kadrey produced what the Authors Guild called a “technical win” for Meta but a substantive opening for future plaintiffs, the industry has been waiting for the right plaintiff slate to bring the next case. Tuesday’s filing names that slate.

The litigation that follows will, over the next two years, decide whether Llama’s training corpus, and by extension that of every comparable model trained on similarly broad scraped data, was the original commercial sin of the AI cycle or its first widely accepted standard practice. There is no third outcome the courts can produce.

Meta will argue, as it argued in Kadrey, that the use is transformative and that no measurable market harm has occurred. The publishers will argue, with documents, accounting, and licensing comparables, that the harm is precisely measurable and that Meta’s selective licensing across the industry establishes a market against which its non-licensing of their works can be valued. Judge Chhabria’s earlier ruling has, in effect, written the brief for both sides.

Whoever read his June opinion most carefully was, on the present evidence, sitting at the publishers’ table on Tuesday.

Story by Alina Maria Stan

Alina Maria Stan builds connections that people actually feel. As co-founder and COO of Tekpon, she turns product intuition into real moments of discovery, shaping how teams find and adopt SaaS every day. Since 2020, she has led Tekpon’s brand voice, media strategy, and growth plays with a clear focus on human outcomes behind every metric. Before Tekpon, Alina followed curiosity across industries and countries. She was CEO of King Casino Bonus and led affiliate and brand strategy at Extremoo Media and Fable Media in Denmark, where she learned how to build partnerships that last. Early on, she sharpened her CRM and pricing instincts at K.H. ApS, always asking why customers choose what they choose. Her approach is rooted in more than a decade of international experience and two master’s degrees, one in Sustainable Consumption from the Technical University of Munich and one in Consumer Affairs Management from Aarhus University.
