AI Security

OWASP LLM Top 10

The ten most critical security risks for applications built on large language models, per the OWASP LLM Top 10 project.

When organizations started embedding large language models into production applications at scale in 2023, the security community needed a structured vocabulary for the risks that were emerging. The OWASP Foundation, already the standard reference for web application security risks, assembled a community working group to produce an equivalent framework for LLM applications. The result, the OWASP Top 10 for LLM Applications, has been updated twice since its initial release, most recently in late 2024 to cover the rapid growth of agentic AI and new attack techniques documented in the wild. It is not a comprehensive attack taxonomy, but it is the most widely adopted framework for communicating about LLM security risks across security teams, developers, and business stakeholders, and understanding it is now a baseline expectation for anyone working at the intersection of AI and security.

What you'll learn

Key takeaways from this topic.
  • Identify each of the ten risk categories in OWASP's 2025 LLM Top 10 and explain the core vulnerability each addresses.
  • Explain how the LLM Top 10 differs from the traditional OWASP Top 10 for web applications and why a separate list is necessary.
  • Use the framework as a structured baseline for assessing and communicating LLM application security risk.

At a glance

Fast mental model before you dive in.
Core concepts
  • LLM-specific attack surface
  • Agentic risk categories
  • Supply chain and training risks
Techniques
  • Prompt injection (LLM01)
  • Excessive agency exploitation
  • RAG and vector poisoning
Defenses
  • Defense in depth across all 10 categories
  • Governance and AI-BOM
  • Least privilege for agents

Core idea

The OWASP LLM Top 10 exists because LLM security risks do not map cleanly onto traditional application security categories. A language model that hallucinates facts is not exhibiting an injection vulnerability or a broken access control issue, but it can still cause significant harm when users rely on its output for decisions. A model that can be instructed through malicious content in a document it processes is experiencing something conceptually related to injection, but the mechanism and the remediation are fundamentally different from SQL injection. The LLM Top 10 provides categories built around how language models actually fail, rather than adapting existing categories that were built for different types of systems.

The 2025 edition reflects the growth of agentic AI specifically. When models only produced text, most risks were about what the model said. When models can also take actions (browsing the web, writing and executing code, sending emails, modifying files), the risk profile shifts substantially. The 2025 Top 10 added or substantially expanded several categories to address this, including Excessive Agency, System Prompt Leakage, and Vector and Embedding Weaknesses.

The framework is a communication tool as much as a technical reference. Security teams use it to ensure that LLM application reviews cover the right ground. Developers use it to understand what kinds of problems to design against. Business stakeholders use it to understand what can go wrong when their organization deploys AI applications. Learning the ten categories well enough to explain each one and its key mitigation is the baseline competence the framework is designed to build.

The ten categories

LLM01: Prompt Injection holds the top position for good reason and has not moved since the list's inception. It exploits the LLM's inability to reliably separate instructions from data, allowing attackers to override intended behavior either through direct manipulation or by planting instructions in content the model processes. Covered in depth in the Prompt Injection topic.

LLM02: Sensitive Information Disclosure rose to second place in the 2025 edition, from sixth. LLMs can memorize and later reproduce fragments of their training data, including PII, proprietary business data, confidential documents, and credentials. Attackers have demonstrated techniques for extracting memorized content through targeted queries. Models can also be prompted to reveal their system prompt, which may contain sensitive configuration or operational logic.

LLM03: Supply Chain addresses the complex dependency chain that surrounds modern LLM applications: foundation models from external providers, fine-tuning datasets from third parties, plugins and integrations, RAG data sources, and inference APIs. Each is an opportunity for compromise. Covered in detail in the Data and Model Poisoning topic.

LLM04: Data and Model Poisoning specifically addresses manipulation of training data, fine-tuning data, and retrieval content. Attackers who can influence these datasets can cause models to produce biased outputs, degrade accuracy on targeted topics, or exhibit backdoor behaviors triggered by specific inputs that are invisible under standard evaluation. Covered in depth in the Data and Model Poisoning topic.

LLM05: Improper Output Handling covers insufficient validation, sanitization, and handling of LLM-generated content before it is passed to downstream systems. When model outputs are used in contexts that interpret them as instructions, code, or structured data, unsanitized outputs can introduce injection vulnerabilities in those downstream systems. A model that generates SQL which is then executed against a database without validation can be the source of SQL injection, even though the malicious statement originated in the model's output rather than in direct user input.
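As a minimal sketch of treating model output as untrusted input, assume a hypothetical application that lets a model draft read-only queries against a SQLite database. The function names and the "single SELECT only" policy below are illustrative assumptions, not something OWASP prescribes; the check is deliberately conservative and is paired with SQLite's `query_only` pragma as a second layer.

```python
import sqlite3


def is_single_select(sql: str) -> bool:
    """Conservative allow-list check: exactly one statement, starting with SELECT."""
    body = sql.strip().rstrip(";").strip()
    # Any remaining semicolon is treated as a second statement and rejected,
    # even if it sits inside a string literal (false rejections are acceptable here).
    return body.lower().startswith("select") and ";" not in body


def run_model_sql(conn: sqlite3.Connection, model_sql: str) -> list:
    """Run model-generated SQL only if it passes the allow-list check."""
    if not is_single_select(model_sql):
        raise ValueError("model-generated SQL rejected by allow-list check")
    # Second layer: ask SQLite itself to refuse writes while the query runs.
    conn.execute("PRAGMA query_only = ON")
    try:
        return conn.execute(model_sql).fetchall()
    finally:
        conn.execute("PRAGMA query_only = OFF")
```

Where the model only needs to supply values rather than whole statements, a fixed query with bound parameters is usually the simpler option.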

LLM06: Excessive Agency is one of the most significantly expanded entries in the 2025 edition. When LLM agents have access to more tools than their task requires, broader permissions than necessary, or the ability to take high-impact actions without human review, they become a larger attack surface. OWASP identifies three root causes: excessive functionality, excessive permissions, and excessive autonomy. The mitigation is least-privilege design for agents.
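One way to picture the "excessive functionality" root cause is a per-task tool allow-list, sketched below. The tool functions, task name, and registry layout are hypothetical placeholders chosen for illustration; a real agent framework would have its own mechanism for registering and exposing tools.

```python
from typing import Callable, Dict, Set


# Hypothetical tools; real implementations would call external systems.
def search_docs(query: str) -> str:
    return f"results for {query!r}"


def send_email(to: str, body: str) -> str:
    return f"sent to {to}"


ALL_TOOLS: Dict[str, Callable[..., str]] = {
    "search_docs": search_docs,
    "send_email": send_email,
}

# Assumed policy: each task lists only the tools it actually needs.
TASK_TOOL_ALLOWLIST: Dict[str, Set[str]] = {
    "answer_support_question": {"search_docs"},
}


def call_tool(task: str, tool_name: str, **kwargs) -> str:
    """Dispatch a tool call only if the tool is allow-listed for this task."""
    allowed = TASK_TOOL_ALLOWLIST.get(task, set())
    if tool_name not in allowed:
        raise PermissionError(f"tool {tool_name!r} is not allowed for task {task!r}")
    return ALL_TOOLS[tool_name](**kwargs)
```

Because the check lives in the dispatcher rather than in the prompt, an injected instruction to send email fails even if the model tries to comply.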

LLM07: System Prompt Leakage is a new category in 2025. System prompts increasingly contain sensitive instructions, credentials, business logic, and security controls. When these prompts are extracted through adversarial querying, attackers gain insight into the application's configuration, its security mechanisms, and the specific instructions it is trying to follow.

LLM08: Vector and Embedding Weaknesses is also new in 2025, reflecting the widespread adoption of RAG architectures. Attackers can poison vector databases by injecting malicious content that gets retrieved during legitimate queries. Insufficient access controls on vector stores can expose sensitive data across tenant boundaries.

LLM09: Misinformation (renamed from "Overreliance" in previous editions) addresses the risk that LLMs generate and confidently assert false information. Models hallucinate facts, fabricate citations, and produce polished responses to questions they cannot reliably answer. The risk is not just that users trust the output too much; it is that the model itself generates and propagates false information that can mislead critical decisions in legal, medical, financial, and security contexts.

LLM10: Unbounded Consumption addresses uncontrolled resource consumption by LLM applications. Unlike traditional denial-of-service attacks that saturate network capacity, LLM resource abuse involves triggering computationally expensive inference operations, prompt chains that consume excessive tokens, or agent loops that make repeated external API calls.
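A rough sketch of bounding an agent loop follows, assuming a hypothetical `step_fn` callback that performs one model call and reports the tokens it used and whether the task is finished. The class and limit names are assumptions for illustration; production systems would typically also enforce per-user rate limits at the API layer.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


class BudgetExceeded(Exception):
    """Raised when a run hits its step or token ceiling."""


@dataclass
class RunBudget:
    max_steps: int
    max_tokens: int
    steps_used: int = 0
    tokens_used: int = 0


def run_agent(step_fn: Callable[[], Tuple[int, bool]], budget: RunBudget) -> RunBudget:
    """Call step_fn until it reports completion or the budget runs out."""
    for _ in range(budget.max_steps):
        tokens, done = step_fn()  # one model call: (tokens consumed, finished?)
        budget.steps_used += 1
        budget.tokens_used += tokens
        if budget.tokens_used > budget.max_tokens:
            raise BudgetExceeded("token budget exhausted")
        if done:
            return budget
    raise BudgetExceeded("step limit reached")
```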

Real-world impact

The framework's value is most visible when organizations use it proactively. Security teams that review LLM applications against the Top 10 before deployment catch risks that standard application security testing does not surface. Organizations that turn to the framework only after an incident often discover that they were exposed to a Top 10 risk they had not considered.

The 2025 IBM Cost of a Data Breach Report found that 13% of organizations had experienced a security breach related to an AI tool, and 97% of those organizations lacked proper AI access controls. The Excessive Agency category addresses precisely this gap. The widespread sensitive information disclosure incidents documented through 2024, including cases where models revealed system prompts, training data, and confidential user information from other sessions, correspond directly to LLM02 and LLM07.

Warning signs

Patterns worth investigating further.
  • An LLM application with tool access or external integrations has not been reviewed for Excessive Agency, or the tools available to the model exceed what its stated task requires.
  • Model-generated outputs are passed to databases, code interpreters, or other downstream systems without being validated or sanitized first.
  • The organization has no AI Bill of Materials documenting the foundation models, datasets, plugins, and external services used in its LLM applications.

DEEP DIVE

Why LLM risks need their own framework

The traditional OWASP Top 10 for web applications addresses risks like injection, broken access control, cryptographic failures, and security misconfiguration. These categories have been refined over two decades of documented web application vulnerabilities and capture the risks that characterize systems built from code that parses structured inputs and produces structured outputs.

Language models operate differently. They process natural language and produce natural language. Their security properties are probabilistic rather than deterministic: the same input can produce different outputs on different runs, which makes traditional input-output security analysis difficult. Their failure modes include generating false information, reproducing memorized private data, and following instructions that conflict with their intended purpose, none of which map to conventional injection or access control categories.

The LLM Top 10 is not a replacement for the traditional Top 10. An LLM application still runs on infrastructure, uses databases and APIs, and requires the same authentication, authorization, and cryptographic protections as any other web application. The LLM Top 10 addresses the additional risk surface that arises specifically from the language model component. Both frameworks apply to production LLM applications; neither is sufficient alone.

Agentic AI and the expanded risk surface

The 2025 edition of the OWASP LLM Top 10 reflects a qualitative shift in how LLMs are deployed. When models only produced text, compromising them primarily affected the quality and trustworthiness of that text. When models act (browsing the web, executing code, calling APIs, sending emails, modifying databases), compromising them causes real-world effects that are much harder to reverse.

Excessive Agency is the category that directly addresses this. An agentic system designed with least-privilege principles has a fundamentally different risk profile from one where the model can access any tool and take any action. An agent that can only read from a database cannot be manipulated into modifying or deleting records through tool calls. An agent that cannot send email cannot be manipulated into sending phishing messages. An agent that requires human confirmation for irreversible actions cannot autonomously execute those actions even if its prompt is successfully injected.
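The human-confirmation idea can be sketched as a small gate in front of action execution. The action names and the `confirm` callback here are hypothetical; in practice the callback might open an approval ticket or prompt an operator, and the classification of what counts as irreversible is a policy decision for each application.

```python
from typing import Callable

# Assumed classification of actions that must never run without approval.
IRREVERSIBLE_ACTIONS = {"delete_records", "send_payment"}


def execute_action(
    action: str,
    run: Callable[[], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run the action, but require human approval first if it is irreversible."""
    if action in IRREVERSIBLE_ACTIONS and not confirm(action):
        # The action callable is never invoked without approval.
        return f"{action}: blocked pending human approval"
    return run()
```

The gate sits outside the model, so a successful injection can at most request the action; it cannot approve it.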

The System Prompt Leakage category is also more significant in agentic contexts. System prompts increasingly contain not just behavioral instructions but credentials, connection strings, business logic, and the specific names and descriptions of available tools. An attacker who extracts this information knows the exact architecture of the system they are attacking: what tools exist, what permissions they have, and what instructions the model is operating under.
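A simple pre-deployment check can flag credential-like material in a system prompt so that it can be moved to a secrets store. The patterns below are illustrative assumptions and far from exhaustive; a dedicated secret scanner would cover many more formats.

```python
import re
from typing import List

# Illustrative patterns only; real secret scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "connection_string": re.compile(r"\b[a-z][a-z0-9+]*://[^\s:/@]+:[^\s@]+@\S+", re.I),
    "inline_credential": re.compile(
        r"\b(?:api[_-]?key|password|secret|token)\s*[:=]\s*\S+", re.I
    ),
}


def find_secrets(system_prompt: str) -> List[str]:
    """Return the names of all patterns that match the prompt, sorted."""
    return sorted(
        name for name, pattern in SECRET_PATTERNS.items() if pattern.search(system_prompt)
    )
```

A non-empty result is a signal to review the prompt by hand; an empty result does not prove the prompt is safe to leak.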

Vector and Embedding Weaknesses emerged from the widespread adoption of RAG architectures in agentic systems. When an agent retrieves context from a knowledge base to inform its actions, the trustworthiness of that knowledge base is a security property. A poisoned knowledge base does not just produce misleading text; it can influence the actions an agent takes based on that text, with all the real-world consequences that follow.
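One way to make knowledge-base trust an enforced property rather than an assumption is to store provenance with every chunk and filter on it before ranking. The sketch below uses a toy in-memory store with hypothetical `tenant_id` and `source` fields; real vector databases expose their own metadata filters, which would replace the list comprehension.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Set


@dataclass(frozen=True)
class Chunk:
    tenant_id: str   # who owns this content
    source: str      # where it was ingested from (assumed provenance label)
    text: str
    embedding: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    store: List[Chunk],
    tenant_id: str,
    trusted_sources: Set[str],
    query_embedding: Sequence[float],
    k: int = 3,
) -> List[Chunk]:
    """Filter by tenant and trusted source BEFORE similarity ranking."""
    candidates = [
        c for c in store if c.tenant_id == tenant_id and c.source in trusted_sources
    ]
    ranked = sorted(
        candidates, key=lambda c: cosine(c.embedding, query_embedding), reverse=True
    )
    return ranked[:k]
```

Filtering first means a highly similar chunk from another tenant or an unvetted upload is never a candidate, regardless of its similarity score.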

Governance and the AI-BOM

The Supply Chain category (LLM03) raises a governance requirement that goes beyond technical controls. Organizations deploying LLM applications need visibility into every component of the AI pipeline: which foundation models they use, who trained them, what data they were trained on, which plugins and integrations are connected, which external APIs and services are called, and which datasets augment the model at runtime.

An AI Bill of Materials (AI-BOM) formalizes this visibility. Analogous to the software SBOM that inventories software components and their dependencies, the AI-BOM inventories AI-specific assets: model checksums and versions, training data sources, fine-tuning dataset provenance, plugin and integration manifests, and API endpoint inventory. This documentation enables incident response, supply chain vetting, and regulatory compliance, as governance frameworks such as the EU AI Act and NIST SP 800-218A increasingly expect organizations to document and govern their AI assets.
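To make the inventory concrete, the sketch below models an AI-BOM as plain data classes and verifies a model artifact against its recorded checksum. The field names are assumptions chosen to mirror the asset types listed above; they do not follow any standard BOM schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ModelEntry:
    name: str
    version: str
    provider: str
    sha256: str  # checksum of the model artifact as deployed


@dataclass
class AIBOM:
    application: str
    models: List[ModelEntry] = field(default_factory=list)
    datasets: List[str] = field(default_factory=list)       # training / fine-tuning sources
    plugins: List[str] = field(default_factory=list)        # plugin and integration manifests
    api_endpoints: List[str] = field(default_factory=list)  # external services called


def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_model(entry: ModelEntry, artifact: bytes) -> bool:
    """True if the artifact still matches the checksum recorded in the AI-BOM."""
    return sha256_of(artifact) == entry.sha256


def to_json(bom: AIBOM) -> str:
    """Serialize the inventory so it can be versioned alongside the application."""
    return json.dumps(asdict(bom), indent=2, sort_keys=True)
```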

Maintaining an AI-BOM is an ongoing operational task rather than a one-time documentation exercise. Models are updated, plugins are added, data sources change, and integrations evolve. Organizations that treat the AI-BOM as a living inventory maintained alongside their application's development lifecycle are positioned to respond to AI-specific security incidents with the same efficiency as software supply chain incidents.

Using the Top 10 for application review

The practical application of the OWASP LLM Top 10 is as a structured review checklist for LLM-powered applications before and after deployment. A review against the framework considers each category in turn:
  • Prompt Injection: does the application pass untrusted external content to the model, and are the model's actions minimized to what is strictly necessary?
  • Sensitive Information Disclosure: has the system prompt been audited for sensitive information, and has the application been tested with queries designed to extract training data?
  • Supply Chain: is there an inventory of all model providers, plugins, and data sources, with versions pinned and checksums verified?
  • Data and Model Poisoning: are fine-tuning datasets from verified, controlled sources, and is the RAG knowledge base access-controlled and integrity-monitored?
  • Improper Output Handling: are model outputs that will be passed to downstream systems sanitized?
  • Excessive Agency: do agents have only the minimum tool access their task requires, and do high-impact actions require human confirmation?
  • System Prompt Leakage: does the system prompt contain credentials or sensitive business logic that should be stored elsewhere?
  • Vector and Embedding Weaknesses: who can write to the vector database, and is the knowledge base integrity monitored?
  • Misinformation: are there human-in-the-loop checkpoints for decisions that rely on model-generated factual claims?
  • Unbounded Consumption: are there rate limits on inference calls, and are agent loops bounded?

This structured review does not catch every possible vulnerability, but it ensures that the most significant LLM-specific risk categories have been considered and addressed before users are exposed.
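For teams that want to track such a review in code, a minimal sketch is shown below. The category identifiers and names come from the list above; the `open_items` helper and the pass/fail representation are assumptions made for illustration.

```python
from typing import Dict, List

# Category IDs and names as listed in the 2025 edition.
CATEGORIES: Dict[str, str] = {
    "LLM01": "Prompt Injection",
    "LLM02": "Sensitive Information Disclosure",
    "LLM03": "Supply Chain",
    "LLM04": "Data and Model Poisoning",
    "LLM05": "Improper Output Handling",
    "LLM06": "Excessive Agency",
    "LLM07": "System Prompt Leakage",
    "LLM08": "Vector and Embedding Weaknesses",
    "LLM09": "Misinformation",
    "LLM10": "Unbounded Consumption",
}


def open_items(results: Dict[str, bool]) -> List[str]:
    """Return categories that failed review or were never reviewed."""
    return [
        f"{cid}: {name}"
        for cid, name in CATEGORIES.items()
        if not results.get(cid, False)  # a missing entry counts as not reviewed
    ]
```

Treating a missing entry as open keeps an unreviewed category from passing silently.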