Agentic AI in law: Navigating the evolution of artificial intelligence

By Tara Sarvnaz Raissi ·

Law360 Canada (August 6, 2025, 3:03 PM EDT) --
Photo of Tara Sarvnaz Raissi
Tara Sarvnaz Raissi
AI agents are driving the next stage in the evolution of artificial intelligence. Agentic AI is a large language model (LLM) that interacts with external systems such as web browsers, databases, emails and calendars, commonly referred to as “tools.” AI agents use tools to execute tasks to pursue user-defined goals. These systems can make decisions and take autonomous actions (see Díaz, Santiago (Sal), Christoph Kern, and Kara Olive. “Google’s Approach for Secure AI Agents: An Introduction.” Google Research, May 2025). Despite their autonomy, they require ongoing human oversight, as their ability to act independently increases AI-related risks.

While LLMs are the intelligence behind both generative and agentic AI, the former relies on continuous user prompts to generate content and is limited to the data it was trained on. For example, generative AI will create a “to-do” list when prompted, using generic tasks from its training data. In contrast, agentic AI is goal driven by design and has access to external tools to solve problems with minimal human guidance. In the example above, an agentic AI system with access to a user’s calendar and email address will generate a personalized to-do list based on the user’s schedule and communications.

The autonomy of AI agents promises to save time and reduce labour costs in legal workflows. However, its greater independence amplifies long-standing concerns about the reliability, transparency and privacy risks inherent to AI. This article will explore how each of these issues apply to agentic AI and identify considerations professionals should keep in mind when using this technology.

Reliability: Hallucination vs. the illusion of understanding

AI agent_image_350W

Sansert Sangsakawrat: ISTOCKPHOTO.COM

Hallucinations raise valid concerns about the reliability of AI-generated output. An LLM system can generate a factually incorrect response or one that is entirely made up. For example, ChatGPT may produce a non-existent legal case that is later caught by opposing counsel or the court. However, even greater risks arise when AI systems produce errors that stem from the illusion of understanding a concept. In these cases, the model appears to understand a principle but fails to grasp how to use it and applies it incorrectly in its reasoning. For example, the system can accurately describe a specific style of poetry but fail to generate a correct example of it when prompted (see Mancoridis, Marina, Bec Weeks, Keyon Vafa, and Sendhil Mullainathan. “Potemkin Understanding in Large Language Models: New Study Reveals Flaws in AI Benchmarks.” Proceedings of the 42nd International Conference on Machine Learning, PMLR, May 2025). Mistakes of this nature can be subtle and are harder to detect than hallucinations. Identifying them requires a critical eye, subject matter expertise and diligent fact checking.

The autonomy of agentic AI models increases their potential for compounding errors (“Agentic Misalignment: How LLMs could be insider threats.” June 20, 2025, Anthropic). These LLMs begin with a command or an input from a human user. Once the task is understood by the model, the system directs its own processes and selects the appropriate tools to complete it. If the LLM has misunderstood or misapplied a concept at the outset, its inaccurate reasoning will shape every subsequent action, leading to a cascade of errors that will go undetected without a step-by-step review by a human.

Transparency: Risky agentic behaviours

LLMs are difficult to understand or reliably predict. Their “black box” nature is even more pronounced in agentic systems where autonomous decision-making adds layers of unpredictability. For example, agentic models may act against users’ interests to pursue their assigned goals. This phenomenon, termed “agentic misalignment,” was observed during stress-tests on a number of leading AI models in hypothetical corporate environments.

In these tests, models were given harmless business goals and granted the autonomy to send emails and access private information. When faced with scenarios such as the threat of being replaced, some models resorted to harmful actions including leaking data and blackmail (“Agentic Misalignment: How LLMs could be insider threats,” Anthropic).

While evidence of this behaviour has not been found in real-world deployments, researchers warn about the transparency risks posed by autonomous AI models that have access to sensitive information. Real-time monitoring, logging decision-making processes and frequent external audits are a few tools that give stakeholders more insight into the model’s reasoning.

Privacy: The blast radius of a data breach will be greater

LLMs are susceptible to manipulation. They cannot reliably distinguish between trusted user instructions and malicious inputs embedded in prompts or external systems. They will follow any instructions that make it to the model, regardless of the source. Without adequate human oversight, this vulnerability will expose private data.

Agentic AI systems autonomously combine and use tools to pursue user instructions. Many tools allow access to private and sensitive information. Some of the same tools connect to external sources such as websites or third-party platforms. Access to private information while interacting with external sources increases the risk of unintentional data breaches. Combined with the system’s autonomy, even a small manipulation can increase the blast radius of a privacy breach. Robust access controls such as exercising limits on tool permissions that are based on task-specific needs can mitigate this risk.

Conclusion

The ability to interact with external tools and act independently introduces new reliability, transparency and privacy risks. A better understanding of how agentic AI works helps identify its vulnerabilities. To mitigate these risks, continuous human oversight is critical. This includes clear boundaries for what agentic systems can and cannot do, mechanisms for human intervention where appropriate, exercising limits on the data and access available to these systems, and regular audits of their actions.

Tara Sarvnaz Raissi (CIPP/C) is senior legal counsel (Ontario, Western and Atlantic Canada) at Beneva and serves on the board of directors of the Toronto Lawyers Association. She has written extensively about AI use in legal practice.

The opinions expressed are those of the author and do not reflect the views of the author’s firm, its clients, Law360 Canada, LexisNexis Canada or any of its or their respective affiliates. This article is for general information purposes and is not intended to be and should not be taken as legal advice.

Interested in writing for us? To learn more about how you can add your voice to Law360 Canada, contact Analysis Editor Yvette Trancoso at Yvette.Trancoso-barrett@lexisnexis.ca or call 905-415-5811.