AIs can generate near-verbatim copies of novels from training data
Researchers from Stanford and Yale found that top AI models memorize far more copyrighted content than their makers claim: Gemini reproduced nearly 77% of Harry Potter with high accuracy, and Claude yielded almost an entire novel verbatim when jailbroken. This directly undermines the industry's core legal defense that models "learn" without storing copies, at a time when Anthropic has already agreed to pay $1.5 billion to settle a copyright case and a German court has ruled against OpenAI over memorized song lyrics.