CIA Triad Applied to AI
Overview
The CIA triad — Confidentiality, Integrity, Availability — remains the foundation for AI security, but each dimension has AI-specific concerns that traditional controls don't cover.
Confidentiality
What it means for AI: Preventing unauthorized disclosure of sensitive information through or from AI systems.
AI-specific threats:
- Training data extraction — model memorizes and leaks PII, credentials, proprietary data
- System prompt leakage — hidden instructions revealed to users
- Conversation data exposure — multi-tenant systems leaking between users
- Embedding inversion — reconstructing text from vector representations
- Model weight theft — exfiltrating the model itself, whose weights can implicitly encode memorized training data
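
Several of these threats surface as sensitive strings in model output, so one common last line of defense is an output filter. The following is a minimal sketch, not a complete PII detector: the `PATTERNS` table and the `redact_output` helper are hypothetical names introduced here, and the three regexes (email address, AWS-style access key ID, US Social Security number format) are illustrative assumptions that a real deployment would replace with broader, tested detectors.

```python
import re

# Illustrative detectors only (assumption): real systems need broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_output(text: str) -> tuple[str, list[str]]:
    """Replace matches with a placeholder and report which detectors fired."""
    fired = []
    for label, pattern in PATTERNS.items():
        # subn returns the rewritten text and the number of substitutions made.
        text, count = pattern.subn(f"[REDACTED_{label}]", text)
        if count:
            fired.append(label)
    return text, fired


if __name__ == "__main__":
    raw = "Contact alice@example.com, key AKIAABCDEFGHIJKLMNOP"
    clean, hits = redact_output(raw)
    print(clean)
    print(hits)
```

Returning the list of detectors that fired, rather than only the cleaned text, lets the caller log or alert on a suspected leak instead of silently masking it.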
→ Deep dive: Confidentiality — Data Leakage & Privacy
Integrity
What it means for AI: Ensuring AI outputs are accurate, unmanipulated, and trustworthy.
AI-specific threats:
- Data poisoning — corrupted training data leads to corrupted behavior
- Prompt injection — attacker-controlled input overrides the intended instructions and manipulates model outputs at inference time
- Hallucination — model generates plausible but false information
- Backdoors — hidden triggers cause specific targeted misbehavior
- Model tampering — unauthorized modification of weights or configuration
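
Model tampering in particular can be detected with an ordinary file-integrity check: record a digest of the weights or configuration file at release time and compare it before loading. The sketch below assumes the known-good SHA-256 digest is stored somewhere the attacker cannot modify; `sha256_of_file` and `verify_artifact` are hypothetical helper names, not part of any model-serving library.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file matches the recorded known-good digest."""
    actual = sha256_of_file(path)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A serving process would call `verify_artifact` before loading the file and refuse to start on a mismatch. This detects modification after the digest was recorded; it does not detect poisoning or backdoors introduced before that point.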
→ Deep dive: Integrity — Poisoning, Manipulation & Hallucination
Availability
What it means for AI: Ensuring AI systems remain operational and performant.
AI-specific threats:
- Model denial of service — crafted inputs that cause high compute cost
- API rate limit exhaustion — legitimate-looking queries consuming all capacity
- Model drift — gradual degradation in output quality as real-world inputs shift, with no attacker involved
- Dependency failure — third-party model API goes down
- Compute resource exhaustion — GPU memory attacks, context window stuffing
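
Because cost in these attacks scales with input size rather than request count, a limiter that charges per token can be more useful than one that counts requests. The following is a minimal sketch of a token bucket with a per-request size cap; the class name `TokenBudget`, its parameters, and the numbers in the usage example are all assumptions chosen for illustration, and the caller is assumed to have already counted the request's tokens.

```python
import time
from typing import Optional


class TokenBudget:
    """Per-client token bucket that charges by request size (hypothetical helper)."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 max_request_tokens: int, now: Optional[float] = None):
        self.capacity = capacity                      # maximum burst, in tokens
        self.refill_per_sec = refill_per_sec          # sustained tokens per second
        self.max_request_tokens = max_request_tokens  # cap against context stuffing
        self.available = capacity
        self.last_refill = time.monotonic() if now is None else now

    def allow(self, request_tokens: int, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.available = min(self.capacity,
                             self.available + elapsed * self.refill_per_sec)
        self.last_refill = now
        if request_tokens > self.max_request_tokens:
            return False  # oversized single request
        if request_tokens > self.available:
            return False  # budget exhausted for now
        self.available -= request_tokens
        return True


if __name__ == "__main__":
    budget = TokenBudget(capacity=1000, refill_per_sec=100, max_request_tokens=600)
    print(budget.allow(700))  # rejected: above the per-request cap
    print(budget.allow(600))  # accepted: within cap and budget
```

The optional `now` argument exists only so the limiter can be driven with a fake clock in tests; production callers would omit it.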
→ Deep dive: Availability — Denial of Service & Model Reliability
Controls Summary
| CIA Pillar | Key Controls |
|---|---|
| Confidentiality | Output filtering, PII detection, differential privacy, access control, DLP for AI |
| Integrity | Input validation, data provenance, output verification, human-in-the-loop, monitoring |
| Availability | Rate limiting, circuit breakers, model redundancy, fallback systems, load balancing |
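
As an illustration of the Availability row, a circuit breaker and a fallback system are often combined: after repeated failures of the primary model API, calls are routed straight to a fallback until a cool-down period passes. The sketch below is a simplified version under stated assumptions; `CircuitBreaker`, its thresholds, and the `primary`/`fallback` callables are hypothetical and stand in for whatever clients a real system uses.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Route calls to a fallback while the primary dependency is failing."""

    def __init__(self, failure_threshold: int = 3, reset_after_sec: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_sec = reset_after_sec
        self.clock = clock  # injectable so tests can control time
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, primary: Callable[[str], str],
             fallback: Callable[[str], str], prompt: str) -> str:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_sec:
                return fallback(prompt)  # open: skip the failing dependency
            self.opened_at = None        # half-open: allow one trial call
        try:
            result = primary(prompt)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or reopen) the circuit
            return fallback(prompt)
        self.failures = 0  # a success closes the circuit fully
        return result
```

The fallback might be a smaller local model, a cached answer, or a static "service degraded" message; which is appropriate depends on the application.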