Reddit and the RSL licensing standard for AI scraping
September 2025
In September 2025 Reddit joined other publishers backing 'Really Simple Licensing', a standard meant to make AI crawlers pay for content — extending Reddit's effort to monetize user-generated posts as AI training data, and its enforcement war against unpaid scrapers.
What happened
On 10 September 2025 a group of major internet publishers, including Reddit, Yahoo, Medium, Quora, Ziff Davis, People Inc., and O'Reilly Media, announced support for a new protocol called RSL, short for 'Really Simple Licensing,' alongside a nonprofit rights organization, the RSL Collective. The standard draws on the familiar RSS syndication format and is designed to go beyond the simple allow/deny of robots.txt: publishers can attach machine-readable licensing and royalty terms to their sites, specifying how AI applications must compensate them for using content.
For Reddit, joining RSL was a logical extension of a strategy it had pursued aggressively since 2024: treating the collective output of its users as a licensable asset. Reddit had already signed paid data deals with Google and OpenAI and had taken legal and technical steps against companies it accused of scraping its content without paying. RSL offered a standardized framework for that posture. It supports a range of models: free access, attribution-only access, subscription, pay-per-crawl (charging each time an AI system crawls content), and pay-per-inference (charging each time content is used to generate an AI response).
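To make the difference between these models concrete, here is a minimal Python sketch of how the amount owed might be computed under each one. It is an illustration only, not RSL's actual schema or pricing logic: the names `Model`, `Usage`, and `amount_owed_cents`, and every rate in the usage note, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Model(Enum):
    """The licensing models named in the article (identifiers are hypothetical)."""
    FREE = "free"
    ATTRIBUTION = "attribution-only"
    SUBSCRIPTION = "subscription"
    PAY_PER_CRAWL = "pay-per-crawl"
    PAY_PER_INFERENCE = "pay-per-inference"


@dataclass(frozen=True)
class Usage:
    """What an AI system did with a publisher's content in one billing period."""
    crawls: int       # number of times content was fetched
    inferences: int   # number of AI responses that drew on the content


def amount_owed_cents(model: Model, usage: Usage,
                      rate_cents: int = 0, flat_fee_cents: int = 0) -> int:
    """Return the charge in whole cents under the given model.

    Integer cents are used to avoid floating-point rounding in money math.
    """
    if model in (Model.FREE, Model.ATTRIBUTION):
        return 0  # no payment; attribution-only asks for credit instead
    if model is Model.SUBSCRIPTION:
        return flat_fee_cents  # fixed fee, independent of volume
    if model is Model.PAY_PER_CRAWL:
        return usage.crawls * rate_cents
    if model is Model.PAY_PER_INFERENCE:
        return usage.inferences * rate_cents
    raise ValueError(f"unknown model: {model}")
```

With made-up figures of 1,000 crawls and 50,000 inferences, `amount_owed_cents(Model.PAY_PER_CRAWL, Usage(1000, 50000), rate_cents=2)` returns 2000, while the same usage billed per inference at 1 cent returns 50000. The gap shows why the choice of model matters: content fetched once can be used in responses many times over.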
The enforcement question is the standard's central weakness, and it bears directly on Reddit. RSL relies on AI companies voluntarily honouring the licensing terms they encounter, yet AI developers have been repeatedly accused of ignoring robots.txt directives. To add teeth, the RSL Collective partnered with the infrastructure company Fastly to act as a gatekeeper, checking whether crawlers have agreed to licensing terms before granting access. Whether that gate holds against well-resourced AI firms determined to acquire training data remains unproven.
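The gatekeeper idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Fastly's or the RSL Collective's implementation: `CrawlRequest`, `LicenseGate`, the crawler identifiers, and the path prefixes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class CrawlRequest:
    crawler_id: str  # identity the crawler presents (assumed already verified)
    path: str        # resource being requested


class LicenseGate:
    """Hypothetical gate: protected paths are served only to licensed crawlers."""

    def __init__(self, licensed_crawlers: Iterable[str],
                 protected_prefixes: Iterable[str]) -> None:
        self._licensed = frozenset(licensed_crawlers)
        self._protected = tuple(protected_prefixes)

    def decide(self, request: CrawlRequest) -> str:
        # Paths outside the protected set are open to everyone.
        if not request.path.startswith(self._protected):
            return "allow"
        # Protected paths require that the crawler has accepted the terms.
        if request.crawler_id in self._licensed:
            return "allow"
        return "deny"
```

For example, a gate built as `LicenseGate({"licensed-bot"}, ["/r/"])` allows `licensed-bot` to fetch `/r/python` and denies an unlisted crawler the same path. The sketch takes `crawler_id` on trust, so it only works against crawlers that identify themselves honestly. That is the same open question the standard itself faces.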
The deeper controversy is one that RSL embodies rather than resolves: who owns the value of user-generated content? Reddit's posts and comments are written by its users, yet the licensing revenue — whether through bilateral deals or a standard like RSL — accrues to Reddit, not to the people who wrote the content. RSL formalizes a marketplace in which platforms monetize their communities' contributions as AI fodder, intensifying the same distributional critique that has shadowed Reddit's other AI-data dealings.
RSL also crystallizes the adversarial turn in the relationship between content platforms and AI developers. Where the early internet ran on open crawling and free access, the AI era has pushed publishers toward toll-gating their content. Reddit's participation marks it as a leading actor in that shift — using its scale and its trove of human conversation as leverage to demand payment, while raising unresolved questions about enforcement, openness, and the rights of the users whose words are being sold.