Reddit sues Perplexity and scraping firms over 'industrial-scale' data harvesting
2025–2026
In October 2025 Reddit sued Perplexity AI alongside scraping companies Oxylabs, AWMProxy, and SerpApi, alleging an 'industrial-scale' operation to harvest Reddit content from Google results and sell or use it.
What happened
On 22 October 2025 Reddit, Inc. filed suit in the U.S. District Court for the Southern District of New York (Case No. 1:25-cv-08736, captioned Reddit, Inc. v. SerpApi, LLC) against the AI search company Perplexity AI together with a group of data-scraping and proxy companies: Oxylabs UAB of Lithuania, AWMProxy, and SerpApi, LLC of Texas. The case was assigned to Judge Paul Engelmayer. Reddit alleged that the defendants had engaged in what it described as an 'industrial-scale' operation to harvest Reddit's content without authorisation.
According to the complaint, the defendants circumvented Reddit's protections by masking their identities and rotating IP addresses, scraped Reddit content indirectly (including by harvesting it from Google search-result pages), and then used or sold that data. Reddit framed the scraping companies as supplying the infrastructure and Perplexity as a downstream beneficiary that used the improperly obtained content. Reddit's claims included violation of the Digital Millennium Copyright Act's anti-circumvention provisions (Section 1201), along with unjust enrichment and unfair competition.
Reddit and Perplexity had already clashed publicly before the suit; the distinctive feature of this action was the inclusion of the proxy and scraping companies (Oxylabs, AWMProxy, and SerpApi) as named defendants. It reflected Reddit's strategy of targeting not only the AI companies that allegedly use scraped content but also the intermediaries that allegedly enable large-scale collection, in effect attacking the supply chain of unauthorised data harvesting.
At the time of writing the case was pending: oral argument was reported to be scheduled for mid-2026, and no ruling on the merits had been issued. Reddit's characterisations of the defendants' conduct as unlawful and 'industrial-scale' were allegations that the defendants would have the opportunity to contest; no court had found any defendant liable.
For an archive of Reddit controversies, the case complements Reddit's separate suit against Anthropic and documents the company's broader litigation campaign over the value of its user-generated content in the AI era. It is best framed around the scraper and proxy co-defendants: the friction with Perplexity had already been covered elsewhere, and the novel element here is Reddit's attempt to hold the data-harvesting intermediaries legally responsible. The entry illustrates how Reddit, having monetised licensed access to its content, pursued those it accused of taking that content for free, and how anti-circumvention law became a tool in disputes over AI training and search data.