Confidentiality — Data Leakage & Privacy
AI-Specific Confidentiality Threats
Training Data Leakage
Models memorize and can reproduce training data. This includes PII (names, emails, phone numbers, addresses), credentials (API keys, passwords in code), proprietary content (internal documents, trade secrets), and copyrighted material.
Risk level: High for any model trained on internal data or fine-tuned on proprietary datasets.
System Prompt Exposure
System prompts often contain business logic, API keys, internal URLs, persona instructions, and security rules. Extraction gives attackers a blueprint of the application.
Conversation Data Exposure
Multi-tenant AI systems — where multiple users share the same model deployment — may leak data between users through shared context, caching, or logging failures.
Shadow AI Data Leakage
Employees paste sensitive data into unauthorized AI tools. This is the most common AI confidentiality risk in enterprises today.
| Data Type | Risk Example |
|---|---|
| Source code | Developer pastes proprietary code into ChatGPT for debugging |
| Customer data | Support rep pastes customer PII into AI for email drafting |
| Financial data | Analyst uploads earnings data to AI for summarization |
| Legal documents | Attorney pastes contracts into AI for review |
| HR records | HR uploads employee reviews for AI-assisted feedback |
Embedding Inversion
RAG systems store document embeddings in vector databases. Research has shown embeddings can be inverted to approximately reconstruct the original text — meaning the vector database itself is a data leakage risk.
Controls
| Control | Implementation | Effectiveness |
|---|---|---|
| Output DLP | Scan model outputs for PII patterns (SSN, CC, email) before returning to user | Medium — catches known patterns, misses novel ones |
| Input DLP | Scan user inputs and block sensitive data from reaching the model | Medium-High — prevents data exposure to third-party models |
| AI acceptable use policy | Define what data can and cannot be shared with AI tools | Foundational — requires training and enforcement |
| CASB integration | Monitor and control employee access to cloud AI services | High — provides visibility into shadow AI |
| Data classification gates | Only allow models to access data at or below their classification level | High — prevents classification boundary violations |
| Differential privacy | Add mathematical noise during training to prevent memorization | High effectiveness but degrades model quality |
| Endpoint controls | Block or monitor clipboard copy to AI web applications | Medium — can be circumvented |
| Audit logging | Log all interactions with AI systems for forensic review | Detective only — doesn't prevent but enables response |
| Token-level filtering | Strip or mask PII from model context before processing | Medium-High — requires robust PII detection |
Metrics
- Number of shadow AI tools detected per month
- PII detection rate in model outputs
- Percentage of AI interactions covered by DLP
- Mean time to detect data leakage incidents
- Employee completion rate for AI acceptable use training