Reddit Sues Anthropic Over AI Data Use

Posted 4 days ago by Anonymous

Legal Battle Over AI Training Data

Reddit has filed a lawsuit against Anthropic, alleging that the AI company illegally used Reddit’s data to train its artificial intelligence models without a proper license. The complaint, filed in a Northern California court, marks the first time a major tech company has legally challenged an AI model provider over its training data practices.

Content Licensing Dispute Details

In the legal filing, Reddit claims Anthropic violated its user agreement by scraping and using platform content for commercial AI training without authorization. The social media giant asserts that Anthropic continued accessing Reddit data even after being notified of the violation.

Ignoring Website Protocols

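Reddit alleges that Anthropic’s scraper bots ignored the site’s robots.txt file, a standard plain-text file in which a website tells automated crawlers which parts of the site they should not crawl. Compliance with robots.txt is voluntary on the crawler’s side. Although Anthropic said in 2024 that it had blocked its bots from Reddit, the bots allegedly went on to access the site more than 100,000 times.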
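
For readers unfamiliar with the mechanism, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard `urllib.robotparser` module. The rules, the bot names (`ExampleBot`, `OtherBot`) and the `example.com` URLs are hypothetical and chosen for illustration only; they are not taken from Reddit's actual robots.txt or from the complaint.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (an assumption for illustration):
# one named crawler is barred from the whole site, and every other
# crawler is barred only from /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleBot", "https://example.com/r/python/"),
        ("OtherBot", "https://example.com/r/python/"),
        ("OtherBot", "https://example.com/private/data"),
    ]:
        verdict = "allowed" if is_allowed(SAMPLE_ROBOTS_TXT, agent, url) else "disallowed"
        print(f"{agent} -> {url}: {verdict}")
```

Nothing in this check stops a crawler that skips it: the file only states the site's wishes, which is why the dispute turns on what the bots allegedly did rather than on a technical barrier.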

Growing Trend in AI Copyright Cases

This lawsuit follows similar legal actions against AI companies:

  • The New York Times sued OpenAI and Microsoft for using news articles
  • Authors including Sarah Silverman filed claims against Meta
  • Music publishers challenged AI audio/video generation startups

Reddit’s Licensing Strategy

Notably, Reddit has established official data partnerships with other AI companies:

  • OpenAI (whose CEO, Sam Altman, holds an 8.7% stake in Reddit)
  • Google

These agreements reportedly include terms protecting user privacy and platform interests.

Legal Positions and Next Steps

Reddit’s Chief Legal Officer Ben Lee stated: “We will not tolerate profit-seeking entities commercially exploiting Reddit content without return for redditors.” The company seeks:

  • Compensatory damages
  • Restitution for Anthropic’s alleged gains
  • An injunction against further content use

Anthropic disputes the claims, with spokesperson Danielle Ghiglieri saying: “We disagree with Reddit’s claims and will defend ourselves vigorously.”

This case represents a significant legal test for AI training data practices as companies increasingly grapple with content licensing in the generative AI era.