Reddit Sues Perplexity AI and Scraping Vendors Over 'Data Laundering'
October 2025
Reddit sued Perplexity AI and three data-scraping intermediaries in federal court, alleging that they ran an industrial-scale scheme to harvest Reddit content from Google search results and feed it to AI products, in violation of the DMCA.
What happened
On October 22, 2025, Reddit filed a lawsuit in the U.S. District Court for the Southern District of New York against Perplexity AI and three data-scraping intermediaries: Oxylabs (Lithuania), AWMProxy (described in the complaint as a former Russian botnet), and SerpApi (Texas). Reddit's central legal theory rested on the Digital Millennium Copyright Act's anti-circumvention provisions, accompanied by claims for unjust enrichment and unfair competition.
Reddit alleged that after AI firms were blocked from accessing Reddit directly, they obtained its content indirectly by harvesting Google's indexed search results at massive scale, with the named scraping vendors acting as brokers who gathered and resold the data. Reddit characterized the practice as 'data laundering' and described the defendants in court papers as 'would-be bank robbers,' alleging they masked their identities, rotated IP addresses, and harvested billions of Google search results containing Reddit posts and comments.
A central piece of Reddit's evidence was a 'honeypot' test: Reddit created a post that only Google's crawler could index and that was not otherwise publicly visible. Reddit said the post surfaced in Perplexity's results within hours, which it argued proved that Perplexity's pipeline sourced content via Google rather than directly from Reddit. Reddit also alleged that after it sent a cease-and-desist letter, Perplexity's citations to Reddit content increased roughly fortyfold rather than decreasing.
Reddit framed the suit as protecting both its paid licensing partnerships (it has formal agreements with OpenAI and Google) and user privacy. Perplexity denied the allegations, accused Reddit of 'extortion' and of opposing an open internet, and argued it does not train AI models on the content but merely summarizes and cites publicly available Reddit discussions.
Impact
The suit became a leading test of whether the DMCA's anti-circumvention rules can police 'data laundering' through third-party scrapers, with the potential to reshape how AI search engines source and cite user-generated content from platforms that license their data.