Confidentiality — Data Leakage & Privacy

AI-Specific Confidentiality Threats

Training Data Leakage

Models memorize portions of their training data and can reproduce them verbatim. This includes PII (names, emails, phone numbers, addresses), credentials (API keys, passwords in code), proprietary content (internal documents, trade secrets), and copyrighted material.

Risk level: High for any model trained on internal data or fine-tuned on proprietary datasets.

System Prompt Exposure

System prompts often contain business logic, API keys, internal URLs, persona instructions, and security rules. Extraction gives attackers a blueprint of the application.

Conversation Data Exposure

Multi-tenant AI systems — where multiple users share the same model deployment — may leak data between users through shared context, caching, or logging failures.

Shadow AI Data Leakage

Employees paste sensitive data into unauthorized AI tools. This is the most common AI confidentiality risk in enterprises today.

| Data Type | Risk Example |
| --- | --- |
| Source code | Developer pastes proprietary code into ChatGPT for debugging |
| Customer data | Support rep pastes customer PII into AI for email drafting |
| Financial data | Analyst uploads earnings data to AI for summarization |
| Legal documents | Attorney pastes contracts into AI for review |
| HR records | HR uploads employee reviews for AI-assisted feedback |

Embedding Inversion

RAG systems store document embeddings in vector databases. Research has shown embeddings can be inverted to approximately reconstruct the original text — meaning the vector database itself is a data leakage risk.

Controls

| Control | Implementation | Effectiveness |
| --- | --- | --- |
| Output DLP | Scan model outputs for PII patterns (SSN, CC, email) before returning to user | Medium — catches known patterns, misses novel ones |
| Input DLP | Scan user inputs and block sensitive data from reaching the model | Medium-High — prevents data exposure to third-party models |
| AI acceptable use policy | Define what data can and cannot be shared with AI tools | Foundational — requires training and enforcement |
| CASB integration | Monitor and control employee access to cloud AI services | High — provides visibility into shadow AI |
| Data classification gates | Only allow models to access data at or below their classification level | High — prevents classification boundary violations |
| Differential privacy | Add mathematical noise during training to prevent memorization | High — but degrades model quality |
| Endpoint controls | Block or monitor clipboard copy to AI web applications | Medium — can be circumvented |
| Audit logging | Log all interactions with AI systems for forensic review | Detective only — doesn't prevent but enables response |
| Token-level filtering | Strip or mask PII from model context before processing | Medium-High — requires robust PII detection |
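To make the output DLP and token-level filtering controls concrete, here is a minimal sketch of a regex-based PII scanner and redactor. The pattern set is a hypothetical illustration covering only the three types the table names (SSN, credit card, email); a production deployment would use a tested PII-detection library rather than hand-rolled patterns, since regexes miss novel or obfuscated formats — exactly the "Medium" effectiveness caveat above.

```python
import re

# Hypothetical pattern set for illustration only. Real deployments
# should use a maintained PII-detection library; regexes like these
# catch known formats but miss novel or obfuscated ones.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def scan_output(text: str) -> dict[str, list[str]]:
    """Return detected PII keyed by type (output DLP: scan before returning)."""
    return {
        name: pattern.findall(text)
        for name, pattern in PII_PATTERNS.items()
        if pattern.search(text)
    }

def redact_output(text: str) -> str:
    """Mask matches in place (token-level filtering: strip PII from context)."""
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED-{name.upper()}]", text)
    return text
```

The same two functions can sit on either side of the model: `redact_output` on user input before it reaches a third-party model (input DLP), and `scan_output` on responses before they reach the user, with findings forwarded to audit logging.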

Metrics

  • Number of shadow AI tools detected per month
  • PII detection rate in model outputs
  • Percentage of AI interactions covered by DLP
  • Mean time to detect data leakage incidents
  • Employee completion rate for AI acceptable use training