Daniel J. Escott
On Nov. 13, 2025, in a courtroom in the Southern District of New York, the veneer of legal safety surrounding enterprise AI was stripped away.
In a decision that should send a chill through every general counsel, managing partner and chief information officer in Canada, Judge Colleen McMahon denied Cohere’s motion to dismiss a sweeping lawsuit brought by a coalition of major publishers, including Condé Nast, Forbes and The Atlantic. The order in Advance Local Media LLC v. Cohere Inc. is not merely a procedural hurdle for a vendor; it is a flashing red light for every organization currently integrating “black box” large language models (LLMs) into their workflows.
The myth of the ‘safe’ summary
For years, the sales pitch for enterprise LLMs has relied on a comforting, albeit legally untested, assumption: that using AI to “summarize” vast amounts of external data is a transformative use that shields the user from copyright liability. Law firms, courts and financial institutions have adopted these tools for precisely this purpose, using them to digest market reports, summarize case law or condense news feeds into actionable intelligence.
The Cohere ruling has shattered this assumption.
In its defence, Cohere argued that its model’s outputs were merely factual digests, unprotectable recitations of facts that could not infringe copyright. Judge McMahon explicitly rejected this defence. She found that AI-generated “substitutive summaries” can, and do, infringe copyright if they replicate the expressive structure, pacing and narrative choices of the original work.
The court noted that “it is not possible to determine infringement through a simple word count,” effectively ruling that an AI can paraphrase every single sentence of a document and still infringe if it parrots the author’s expressive choices. This concept of the “substitutive summary” is a poison pill for enterprise AI adoption.
Consider the implications for a Canadian law firm or government agency. If your staff uses a tool like Cohere’s “Command” model to generate summaries of paywalled content, copyrighted market analyses or proprietary legal commentary without a specific licence, you are no longer just consuming information. You are potentially trafficking in infringing derivative works. The very feature that made the tool attractive, its efficiency, has now become a generator of liability.
This ruling suggests that the “fair dealing” or “fair use” defence for AI summarization is far narrower than the industry has claimed. If the summary serves as a market substitute for the original and saves you the trouble of reading (and paying for) the source text, it is likely infringing. For an enterprise user, the “black box” that generates that summary is no longer a productivity tool; it is a liability engine running inside your firewall.
Hallucinations are no longer just ‘glitches’; they are trademark violations
Perhaps the most damning aspect of the Cohere decision, particularly for the legal profession, is the court’s treatment of AI “hallucinations.”
Until now, the industry has largely treated hallucinations, the tendency of LLMs to invent facts and sources, as a technical glitch or “quirk” of the technology that requires human oversight. Judge McMahon’s ruling completely reframes this risk. The court allowed trademark claims against Cohere to proceed precisely because of these hallucinations.
The plaintiffs alleged that Cohere’s model didn’t just fail to answer questions; it invented entirely fake articles and attributed them to real publishers. For example, when prompted to “tell me about the unknowability of the undecided voter,” the model might fabricate a New Yorker article that never existed. The court found that this wasn’t just an error; it was a “false designation of origin” under the Lanham Act. By attributing fabricated content to a trusted brand, the AI was damaging the brand’s reputation.
For the legal profession, the implications are catastrophic. We rely on the authority of our sources. If a lawyer uses an AI tool that fabricates a case, a statute or a regulation and attributes it to a real source, like the Supreme Court of Canada or the Department of Justice, that lawyer may not merely be found incompetent. Under this new legal theory, they (and their firm) could be complicit in trademark infringement and “passing off,” opening a whole new world of potential liability for counsel and, indeed, any professional or organization.
We have already seen lawyers sanctioned for submitting hallucinated cases to the court. This ruling opens a new front of liability: the AI provider itself, and potentially the enterprise user who deploys the tool, can be held liable for the reputational damage caused by these fabrications. Reliability is no longer just a performance metric; it is a legal standard. If your AI cannot guarantee the provenance of its output, if it cannot prove that the document it cites actually exists, it is legally unsafe for professional use.
The poisoned tree: The dangers of unknown training data
At the root of these liabilities is the issue of training data: the “black box” itself.
In the Cohere proceedings, the reality of how these models are built was laid bare. Cohere, like many of its competitors, admitted to relying on datasets such as “Common Crawl,” a massive scrape of the open internet that makes no distinction between public-domain works, pirated content and copyrighted material.
This is the “poisoned tree” from which all fruit of the generative AI revolution has grown.
For an enterprise user, this presents an impossible dilemma. You cannot build a compliant justice system, a secure banking infrastructure or a defensible corporate strategy on a foundation of stolen data. When a vendor hides their data sources behind a wall of proprietary secrecy, they are not protecting their intellectual property; they are hiding their liability. And when you sign that licence agreement, you are inheriting the risk of that theft.
This issue is compounded by Canada’s unique position. We have long championed “data sovereignty,” the idea that Canadian data should be subject to Canadian laws. Yet, by adopting models like Cohere’s, which are trained indiscriminately on vast swaths of American and international data, we are importing extraterritorial legal risk.
The Cohere ruling demonstrates that U.S. courts are willing to pierce the veil of these models. If a Canadian bank or government department is using a model trained on infringing American data to generate “substitutive summaries” of American content, are they insulated from U.S. copyright law? It is a gamble no responsible risk manager should be willing to take.
We are witnessing a collision between the “move fast and break things” culture of Silicon Valley (and its Canadian imitators) and the immutable principles of property rights and due process. The courts are signalling that they will not allow the “black box” to serve as a liability shield. If you cannot explain where your AI got its information, you cannot defend its outputs.
The liability shift
The Advance Local Media v. Cohere decision is a watershed moment. It marks the end of the era of “innocent adoption.” We can no longer treat AI procurement as a simple IT decision, focused solely on feature sets and API costs. It is now, fundamentally, a risk management crisis.
For Canadian justice leaders, law firms and corporations, the path forward requires a dramatic pivot. We must stop asking “what can this model do?” and start asking “how does this model know what it knows?”
The era of the “black box” in regulated industries is over. If we are to integrate AI into the administration of justice or the operation of critical enterprise, we must demand explainability and provenance. We need systems that can show their work, that can trace every output back to a verifiable, licensed, and authentic source. We need “citation, not creation.”
The risks exposed by the Cohere ruling (substitutive infringement, trademark violations via hallucination and liability for unknown training data) are not theoretical. They are now the subject of active litigation.
If you don’t know what your AI was trained on, you don’t know who you’re stealing from. And as of this month, the courts have made it clear: they are ready to hold you accountable for it.
Daniel J. Escott is a research fellow at the Artificial Intelligence Risk and Regulation Lab and the Access to Justice Centre for Excellence. He is currently pursuing a PhD in AI regulation at Osgoode Hall Law School, and holds an LLM from Osgoode Hall Law School, a JD from the University of New Brunswick and a BBA from Memorial University of Newfoundland.
The opinions expressed are those of the author(s) and do not necessarily reflect the views of the author’s firm, its clients, Law360 Canada, LexisNexis Canada or any of its or their respective affiliates. This article is for general information purposes and is not intended to be and should not be taken as legal advice.
