Eddie Suarez and Sara Mieczkowski / https://suarezlawfirm.com
AI copyright litigation is not our usual territory. As a rule, we defend clients in federal and state courtrooms against the government in high-stakes cases. But our firm prides itself on always being in touch with our inner tech nerd, and this has allowed us to recognize the benefit to our clients of applying appropriate technological developments. We have often been early adopters of new technological advances, from e-discovery tools to appropriate integration of LLMs into our workflows. This tech nerdiness has served our clients well and has helped us to “punch above our weight.” So, when a lawsuit drops that has the potential to redraw the rules for how AI systems are allowed to operate, our inner tech nerd pays attention. This past Thursday, May 28, 2026, CNN v. Perplexity was filed in the Southern District of New York, and it is one of those cases.
This is not your typical AI copyright case
Most of the AI copyright litigation you have heard about targets what happens when a company builds an AI model. The authors’ suits and headline cases like the New York Times’ action against OpenAI all center on whether scraping huge volumes of copyrighted text to train a model is infringing. In several major GenAI training cases in 2025, courts held that training on lawfully obtained works was highly transformative and qualified as fair use, at least on the records before them, while the court in Thomson Reuters v. Ross took a more skeptical view.
CNN v. Perplexity targets something different: what happens when a model runs. Perplexity does not just train on content and let a model generate from its own memory. It uses a technique called retrieval-augmented generation, or RAG. Every time a user asks a question, Perplexity reaches out in real time, grabs live content from across the web (including paywalled articles), feeds it into the model as context, and generates an answer built directly from what it just copied. The copying is not historical. It happens at the moment of the query.
That distinction matters. The fair use arguments that protected AI companies in the training cases rest on the idea that ingesting text to learn patterns serves a different purpose than the original text served. A novel exists to be read. A model trained on novels learns something about human language, which is a different and arguably transformative use. But a retrieval copy of a CNN article exists for one reason: to deliver the substance of that article to the Perplexity user, which is the same purpose the original article serves for CNN’s own readers.
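For readers who want to see the mechanics, here is a minimal sketch of the retrieve-then-generate pattern described above. It is not Perplexity’s implementation, which is not public in this form. The `Document` type, the word-overlap scoring, the example URLs, and the sample text are all assumptions made up for illustration, and the final call to a language model is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Document:
    url: str   # hypothetical source address
    text: str  # the page content fetched at query time


def retrieve(query: str, corpus: list[Document], k: int = 1) -> list[Document]:
    """Rank documents by how many words they share with the query (toy scoring)."""
    terms = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda d: len(terms & set(d.text.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, docs: list[Document]) -> str:
    """Paste the retrieved text into the prompt; the model answers from this context."""
    context = "\n".join(f"[{d.url}] {d.text}" for d in docs)
    return f"Context:\n{context}\n\nQuestion: {query}"


corpus = [
    Document("https://example.com/a", "city council approves new transit budget"),
    Document("https://example.com/b", "local team wins championship game"),
]
query = "what did the city council approve"
top = retrieve(query, corpus)
prompt = build_prompt(query, top)
print(prompt)
```

The point to notice is in `build_prompt`: the source text is copied into the prompt on every query, which is why the copying in a RAG system happens at the moment of use rather than once during training.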
What CNN alleges, and why the details are damaging
The complaint is well crafted, detailed, and, in places, stark. CNN alleges that Perplexity unlawfully copied over 17,000 CNN articles, videos, and other works to power its products. It includes side-by-side comparisons showing Perplexity’s output next to the original CNN text, where the two are nearly identical. The examples are not cherry-picked edge cases. According to CNN, the complaint’s exhibits show Perplexity’s Comet browser displaying verbatim versions of paywalled CNN articles, effectively bypassing CNN’s subscription wall.
The complaint also alleges stealth crawling: that Perplexity evaded CNN’s robots.txt blocks using undisclosed IP addresses and undeclared crawlers designed to impersonate ordinary browsers. This matters beyond the ethics of it. Courts in the Second Circuit treat the defendant’s conduct as part of the fair-use analysis. When a defendant could have obtained the work legitimately through a license but instead evaded an express block after being told not to proceed, that weighs against fair use. CNN sent a cease-and-desist letter in December 2025. Perplexity never responded and kept going.
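To see why an undeclared crawler matters, it helps to know that robots.txt rules are matched against the name a crawler announces for itself. The sketch below uses Python’s standard `urllib.robotparser` module. The rules, the `ExampleAIBot` name, and the URL are hypothetical and are not taken from CNN’s actual robots.txt or from the complaint.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one self-declared AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Check whether the given self-reported user agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


article = "https://news.example.com/2026/05/some-article"

# A crawler that identifies itself honestly is blocked by the rule.
declared = is_allowed("ExampleAIBot", article)

# The same crawler presenting a browser-style user agent matches only the wildcard rule.
disguised = is_allowed("Mozilla/5.0 (Windows NT 10.0; Win64; x64)", article)

print(declared, disguised)  # False True
```

In other words, robots.txt is a request that depends on honest self-identification. It has no technical means of stopping a crawler that announces itself as an ordinary browser, which is the conduct CNN alleges.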
There is also a trademark angle that is less discussed but potentially significant. CNN alleges that Perplexity’s chatbot generated fabricated content falsely claiming that users could access CNN’s premium subscription content through Perplexity, an affiliation that does not exist. The complaint also documents instances of hallucinated quotes attributed to a named individual, with CNN’s trademark displayed alongside them, and Perplexity then suggesting the garbled quotes might reflect CNN’s own transcription failures. A court in a parallel case, Advance Local Media v. Cohere, recently held that this theory survives dismissal under the Lanham Act: misattributing fabricated content to a news organization’s trademark is actionable.
The legal questions nobody has answered yet
Output infringement is not itself a new theory. Courts have held that AI outputs can infringe when they reproduce protectable expression. That claim survived dismissal in the OpenAI MDL and in the Dow Jones case against Perplexity itself. But no court has yet decided, on a full record, whether real-time retrieval-and-reproduction is fair use. The key fair-use rulings that have favored AI companies so far have all been about training, not deployment. This case is about deployment, and those rulings do not transfer cleanly.
The fourth fair-use factor, market harm, may be the most consequential piece of new ground. Courts in the training cases rejected the idea that a licensing market for AI training existed, calling the argument circular: you cannot define the market by assuming infringement. But Perplexity itself created a licensing market for retrieval and display. It launched a Publishers’ Program to share revenue with content creators. It negotiated a short-form agreement with CNN. The deal fell apart, CNN walked away, and Perplexity kept scraping anyway. Whether courts treat that functioning market as decisive on factor four is an open and potentially field-shaping question.
The willfulness exposure is also worth noting. CNN holds registered copyrights for over 17,000 works at issue. Under the Copyright Act, 17 U.S.C. § 504(c), a finding of willful infringement raises the statutory damages ceiling from $30,000 to $150,000 per work. The math speaks for itself. It is the same dynamic at work in the Bartz case against Anthropic, where Anthropic won on the training fair-use question but still faced catastrophic exposure on separate piracy claims, ultimately settling for $1.5 billion to avoid a trial on those surviving issues.
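For anyone who wants the arithmetic spelled out, here is a rough sketch using the per-work ceilings and the work count mentioned above. It computes a theoretical maximum only. It assumes exactly 17,000 eligible works (the complaint says “over 17,000”) and a maximum award on every one of them, which is not a prediction of what any court would do.

```python
# Statutory damages ceilings per work, in dollars, as stated in the paragraph above.
ORDINARY_CEILING = 30_000
WILLFUL_CEILING = 150_000


def max_statutory_exposure(works: int, willful: bool) -> int:
    """Return the theoretical maximum statutory damages in dollars."""
    per_work = WILLFUL_CEILING if willful else ORDINARY_CEILING
    return works * per_work


works_at_issue = 17_000  # assumption: a floor, since the complaint says "over 17,000"

print(f"{max_statutory_exposure(works_at_issue, willful=False):,}")  # 510,000,000
print(f"{max_statutory_exposure(works_at_issue, willful=True):,}")   # 2,550,000,000
```

On those assumptions, a willfulness finding moves the theoretical ceiling from roughly half a billion dollars to more than two and a half billion.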
Why this matters beyond the parties
If a court reaches the merits and holds that real-time retrieval-and-reproduction is a market substitute rather than a transformative use, the impact falls on the most visible and consumer-facing layer of AI: the part that answers your questions by pulling from live sources. The likely industry response is already taking shape. It points toward licensed content arrangements, output limits, stricter anti-reproduction filtering, and behavior that sends users to the original source with a short attributed pointer rather than substituting for it.
That is not a shutdown of AI-assisted research. It is a structural shift from scrape-and-reproduce toward license, filter, and link back. And it is worth being clear about who that protects: the journalists, photographers, and editors doing expensive, dangerous, original work that no model can replace. CNN notes in its complaint that it maintains 36 bureaus worldwide and sent reporters to Tehran during active conflict earlier this year. That kind of reporting does not emerge from a training dataset.
Keep in mind that the major AI copyright cases keep settling before any court rules on the hardest questions. The foundation for fair use in AI training cases survives, especially where models are trained on lawfully obtained works, but courts have been clear that piracy and market harm remain live fault lines. What is being tested here is whether that foundation also licenses a product built to hand a consumer the substance of a current, paywalled article. That question may, once again, go unanswered on the merits. But the pressure the litigation creates, and the terms any settlement would impose, move the industry in the same direction regardless.
The bottom line
CNN v. Perplexity is not a case about whether AI companies can build powerful models. That question has been answered, largely in the industry’s favor. This case is about whether those models can operate by continuously retrieving, copying, and reproducing live, original journalism without paying for it, and whether they can keep doing so after being expressly told to stop.
How courts answer the deployment question is likely to shape what AI-assisted research looks like for years to come, and that is something every practitioner and researcher paying attention to technology should understand. We will be watching. Not because copyright litigation is our practice, but because we are a curious bunch and we love to do periodic research and deep dives into subjects that interest us. We also think it is important to validate the results of LLM-assisted research by cross-checking them in a different LLM. We have found Perplexity to be a strong product. While it is hard not to recognize the merits of the arguments advanced by CNN, we selfishly love the benefits presented by a service like Perplexity, which not only is a highly effective research tool but also provides users with citations, making verification easier. In short, not as a legal opinion, but as a matter of feeding our inner tech nerd, we are rooting for Perplexity.
