CIA Triad Applied to AI

Overview

The CIA triad — Confidentiality, Integrity, Availability — remains the foundation for AI security, but each dimension has AI-specific concerns that traditional controls don't cover.

Confidentiality

What it means for AI: Preventing unauthorized disclosure of sensitive information through or from AI systems.

AI-specific threats:

  • Training data extraction — model memorizes and leaks PII, credentials, proprietary data
  • System prompt leakage — hidden instructions revealed to users
  • Conversation data exposure — multi-tenant systems leaking between users
  • Embedding inversion — reconstructing text from vector representations
  • Model weight theft — exfiltrating the model itself, which implicitly encodes its training data

→ Deep dive: Confidentiality — Data Leakage & Privacy
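Output filtering, one of the confidentiality controls summarized at the end of this page, can be sketched as a redaction pass over model output before it reaches the user. This is a minimal sketch: the patterns below (email, SSN, API-key prefix) are illustrative assumptions, not a complete PII taxonomy, and regexes alone will miss memorized secrets in other formats.

```python
import re

# Illustrative patterns only -- a real deployment would use a dedicated
# PII/secret-detection service, not a hand-rolled regex list.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|pk)-[A-Za-z0-9]{16,}\b"),
}

def redact_output(text: str) -> tuple[str, list[str]]:
    """Scan model output, redact matches, and report which categories fired."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            found.append(label)
            text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text, found

redacted, hits = redact_output("Contact bob@example.com, SSN 123-45-6789.")
```

The category report (`hits`) matters as much as the redaction: it feeds the monitoring pipeline that tells you a model is leaking at all.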

Integrity

What it means for AI: Ensuring AI outputs are accurate, unmanipulated, and trustworthy.

AI-specific threats:

  • Data poisoning — corrupted training data leads to corrupted behavior
  • Prompt injection — attacker-controlled input manipulates model behavior at inference time
  • Hallucination — model generates plausible but false information
  • Backdoors — hidden triggers cause specific targeted misbehavior
  • Model tampering — unauthorized modification of weights or configuration

→ Deep dive: Integrity — Poisoning, Manipulation & Hallucination
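Input validation against prompt injection can start with a heuristic pre-filter. The marker phrases below are illustrative assumptions, and no phrase list is sufficient on its own; treat this as one layer alongside output verification and human-in-the-loop review, not as a complete defense.

```python
# Illustrative marker phrases -- attackers paraphrase freely, so this
# catches only low-effort injections and should gate into review, not block alone.
INJECTION_MARKERS = [
    "ignore previous instructions",
    "ignore all prior instructions",
    "you are now",
    "reveal your system prompt",
]

def flag_injection(user_input: str) -> bool:
    """Return True if the input contains a known injection marker phrase."""
    lowered = user_input.lower()
    return any(marker in lowered for marker in INJECTION_MARKERS)
```

Flagged inputs are best routed to logging and human review rather than silently dropped, since false positives on benign text are common.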

Availability

What it means for AI: Ensuring AI systems remain operational and performant.

AI-specific threats:

  • Model denial of service — crafted inputs that cause high compute cost
  • API rate limit exhaustion — legitimate-looking queries consuming all capacity
  • Model drift — gradual performance degradation without explicit attack
  • Dependency failure — third-party model API goes down
  • Compute resource exhaustion — GPU memory attacks, context window stuffing

→ Deep dive: Availability — Denial of Service & Model Reliability
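Rate limiting, the first availability control in the summary below, can be sketched as a token bucket sitting in front of the model API. Capacity and refill rate here are illustrative values chosen for the sketch, not recommendations.

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: requests spend tokens; tokens refill over time."""

    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then admit the request if affordable."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(capacity=5, refill_per_sec=1.0)
results = [bucket.allow() for _ in range(7)]  # first 5 admitted, last 2 rejected
```

Weighting `cost` by prompt length or estimated token count extends the same bucket to guard against context-window stuffing, where a few oversized requests exhaust capacity that a per-request count would miss.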

Controls Summary

CIA Pillar        Key Controls
Confidentiality   Output filtering, PII detection, differential privacy, access control, DLP for AI
Integrity         Input validation, data provenance, output verification, human-in-the-loop, monitoring
Availability      Rate limiting, circuit breakers, model redundancy, fallback systems, load balancing