OWASP LLM Top 10

Overview

The OWASP Top 10 for LLM Applications is the standard vulnerability taxonomy for AI application security. Version 2.0 (2025) covers:

Attacker manipulates model behavior by injecting instructions through direct input or via untrusted data sources the model processes.

Impact: Unauthorized actions, data leakage, system prompt bypass Cross-reference: Prompt Injection

The model reveals confidential information through its responses — training data, system prompts, PII, API keys, or proprietary business logic.

Impact: Privacy violation, credential exposure, IP leakage Cross-reference: Training Data Extraction, System Prompt Extraction

Compromised models, poisoned training data, vulnerable plugins, or malicious third-party components in the AI stack.

Impact: Backdoored behavior, malicious code execution, data theft Cross-reference: Supply Chain Attacks

Manipulation of training, fine-tuning, or embedding data to introduce vulnerabilities, backdoors, or biases into the model.

Impact: Compromised model integrity, targeted misclassification, hidden triggers Cross-reference: Data Poisoning & Backdoors

Application fails to validate, sanitize, or safely handle model outputs before passing them to downstream systems (databases, browsers, APIs).

Impact: XSS, SSRF, privilege escalation, remote code execution via model-generated payloads

Model is granted too many capabilities, permissions, or autonomy. Combines with prompt injection for maximum impact.

Impact: Unauthorized API calls, data modification, financial transactions Cross-reference: RAG & Agentic Systems

Attacker extracts the system prompt, revealing hidden instructions, business logic, safety rules, API keys, or persona definitions.

Impact: Attack surface exposure, credential theft, bypass roadmap Cross-reference: System Prompt Extraction

Exploitation of vulnerabilities in RAG pipelines — poisoned embeddings, retrieval manipulation, or unauthorized access to vector stores.

Impact: Information manipulation, unauthorized data access, injection via retrieved content

Model generates false or misleading content that appears authoritative — hallucinations presented as fact.

Impact: Reputational damage, legal liability, bad business decisions

Resource exhaustion attacks — crafted inputs that consume excessive compute, memory, or API credits.

Impact: Denial of service, financial damage from runaway API costs