Major publishers sue Meta over Llama AI training on books

TL;DR:

  • Hachette, Macmillan, McGraw Hill, Elsevier and Cengage, joined by bestselling author Scott Turow, have sued Meta and chief executive Mark Zuckerberg in Manhattan federal court, alleging that the company committed “one of the most massive infringements of copyrighted materials in history” to train its Llama models.
  • The complaint claims that Meta accessed millions of copyrighted books and journal articles from pirate sites, downloaded unauthorised scrapes of “virtually the entire internet” and stripped attribution data, and that Zuckerberg “personally authorised” the conduct after abandoning licensing talks.
  • Resultsense view: this is the first proposed class action filed by major academic and trade publishers — a different plaintiff profile from earlier author suits. UK publishing groups are watching closely; the outcome will shape licensing leverage in the next round of UK government AI-and-copyright consultations.

The case, filed on Tuesday, opens a new front in a string of copyright disputes brought against Meta, OpenAI, Microsoft and Anthropic over generative AI training data. The plaintiffs are seeking unspecified damages and aim to represent a broader class of copyright owners. Specific titles named in the complaint include N. K. Jemisin’s The Fifth Season and Peter Brown’s The Wild Robot.

Why this filing is different

Earlier suits brought by individual authors have produced split outcomes. In June 2025, Meta won a similar case brought by authors including Ta-Nehisi Coates and Richard Kadrey on fair-use grounds, although the judge called the plaintiffs’ market-harm theory a “potentially winning argument” that had been left underdeveloped. Anthropic, by contrast, settled a comparable suit for $1.5bn (£1.1bn) last year over its use of pirated texts.

The Manhattan complaint attempts to close the evidentiary gap that cost the authors the earlier case. The publishers argue that Llama is “an infinite substitution machine” producing imitation versions of their works, and that AI-generated books are “already flooding” Amazon’s marketplace in volumes that “materially displace human-authored works”.

Meta’s response

A Meta spokesperson told the FT that the company would fight the lawsuit “aggressively”, adding: “AI is powering transformative innovations, productivity and creativity for individuals and companies, and courts have rightly found that training AI on copyrighted material can qualify as fair use.”

UK relevance

UK publishers — including the Big Five trade houses and academic groups Elsevier and Taylor & Francis — have spent the past year pressing the UK government for stronger copyright protections in the AI training context. The Manhattan filing arrives as the government considers responses to its December 2024 consultation on text-and-data-mining exceptions, in which publishers and rights-holder groups have rejected the proposed opt-out model.

Looking forward

The case will turn on whether the new evidence on attribution-stripping and Zuckerberg’s personal involvement can move a US judge past the fair-use defence. For UK rights-holders, the practical near-term effect is leverage: with five of the world’s biggest publishers now litigating, vendor licensing conversations will start from a different baseline. UK corporate buyers using Llama in production should review their indemnification clauses now rather than later.