Incident Response for AI Systems
AI-Specific IR Considerations
Traditional incident response frameworks (NIST SP 800-61, SANS) apply, but AI incidents have unique characteristics:
- Attribution is harder. A prompt injection attack can be indistinguishable from a normal user query in the logs.
- Blast radius is unclear. If a model is compromised via poisoning, every output since the last known-good checkpoint is suspect.
- Evidence is ephemeral. Conversation logs may not capture the full context, and model state isn't easily snapshotted.
- Remediation is slow. You can't patch a model the way you patch software. Retraining can take weeks and cost millions.
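Because evidence is ephemeral, the practical mitigation is to capture enough context at inference time that an incident can be reconstructed later. A minimal sketch (all field names here are illustrative, not a standard schema):

```python
import hashlib
import json
import time

def log_inference(log_file, model_version, system_prompt, user_input,
                  retrieved_docs, output):
    """Append one inference record with enough context to reconstruct
    an incident later."""
    record = {
        "timestamp": time.time(),
        "model_version": model_version,
        # Hash the system prompt so changes are detectable without
        # storing the full prompt in every record.
        "system_prompt_sha256": hashlib.sha256(
            system_prompt.encode()).hexdigest(),
        "user_input": user_input,
        # Record which RAG documents were retrieved, not their contents.
        "retrieved_doc_ids": [d["id"] for d in retrieved_docs],
        "output": output,
    }
    log_file.write(json.dumps(record) + "\n")
    return record
```

Append-only JSONL keeps the log cheap to write and easy to grep during triage; the prompt hash lets you prove which system prompt was live at the time without bloating every record.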
AI Incident Categories
| Category | Example | Severity |
|---|---|---|
| Data leakage via AI | Model outputs PII, credentials, or proprietary data | Critical |
| Prompt injection in production | Attacker hijacks AI assistant behavior | High |
| Model compromise | Poisoned model deployed, backdoor activated | Critical |
| Shadow AI data exposure | Employee uploads sensitive data to unauthorized AI tool | High |
| Hallucination with impact | AI provides false information that drives a business decision | Medium-High |
| AI-powered social engineering | Deepfake or AI-generated phishing targeting employees | High |
| API abuse / extraction | Anomalous query patterns indicating model theft | Medium |
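For the API abuse / extraction category, a crude but useful first signal is per-client query volume in a sliding window. The sketch below flags clients exceeding a threshold; the thresholds are illustrative and should be tuned against your own baseline traffic, and a real detector would also look at query diversity:

```python
import time
from collections import defaultdict, deque

class ExtractionDetector:
    """Flags clients whose query volume in a sliding window exceeds a
    threshold -- a rough proxy for model-extraction probing."""

    def __init__(self, window_seconds=3600, max_queries=500):
        self.window = window_seconds
        self.max_queries = max_queries
        self.history = defaultdict(deque)  # client_id -> query timestamps

    def record(self, client_id, now=None):
        """Record one query; return True if the client looks suspicious."""
        now = time.time() if now is None else now
        q = self.history[client_id]
        q.append(now)
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) > self.max_queries
```

A flag here should feed your triage queue, not an automatic block: legitimate batch workloads also produce high volume.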
Response Playbook
Immediate (0-4 hours)
- Confirm the incident — is this a real AI-specific issue or a traditional security incident?
- Contain — disable the affected AI endpoint, revoke API keys, block the source
- Preserve evidence — export conversation logs, model version, system prompt, RAG state
- Notify stakeholders — CISO, legal, privacy team, affected business owners
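The preserve-evidence step above can be partly automated so responders don't improvise it at 2 a.m. A minimal sketch, assuming the artifacts are available as in-memory objects (the structure and names are illustrative):

```python
import hashlib
import json
import time

def preserve_evidence(bundle_path, conversation_logs, model_version,
                      system_prompt, rag_index_manifest):
    """Write a timestamped evidence bundle covering the items in the
    playbook step, and return a hash that seals its contents."""
    bundle = {
        "captured_at": time.time(),
        "model_version": model_version,
        "system_prompt": system_prompt,
        "rag_index_manifest": rag_index_manifest,
        "conversation_logs": conversation_logs,
    }
    # Canonical JSON so the same bundle always yields the same digest.
    payload = json.dumps(bundle, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    with open(bundle_path, "w") as f:
        f.write(payload)
    return digest  # record this hash in the incident ticket
```

Recording the digest in the ticket at capture time gives you a chain-of-custody anchor: anyone can later verify the bundle hasn't been altered.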
Short-term (4-48 hours)
- Determine scope — how many users were affected? What data was exposed?
- Root cause analysis — was it injection, poisoning, misconfiguration, or insider?
- Remediate — patch the system prompt, update filters, roll back the model if needed
- Communicate — internal notification, customer notification if data exposed
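Scope determination often starts as a sweep of the preserved conversation logs for PII-looking output. A first-cut sketch (the patterns are illustrative; real scoping should lean on your DLP tooling):

```python
import re

# Illustrative patterns only -- extend to match your data classes.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scope_exposure(conversation_logs):
    """Return the set of user IDs whose conversations contain
    PII-looking model output, as a first cut at blast radius."""
    affected = set()
    for entry in conversation_logs:  # dicts: {"user_id", "output"}
        for pattern in PII_PATTERNS.values():
            if pattern.search(entry["output"]):
                affected.add(entry["user_id"])
    return affected
```

The output is an upper-bound candidate list: every hit still needs human review before it drives customer notification.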
Long-term (1-4 weeks)
- Post-incident review — what failed and why?
- Update controls — new filters, monitoring rules, access restrictions
- Red team validation — test that the fix actually works
- Policy updates — revise AI governance based on lessons learned
- Regulatory reporting — if required (GDPR breach notification, etc.)
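Red team validation is easiest to sustain when the original attack prompts become a regression suite replayed against the patched control. A minimal sketch; `naive_filter` is a stand-in for whatever guardrail you actually shipped:

```python
def blocks_injection(guardrail, attack_prompts):
    """Replay known attack prompts against the patched guardrail and
    return any that still get through. `guardrail` is any callable
    returning True when a prompt is blocked (illustrative interface)."""
    return [p for p in attack_prompts if not guardrail(p)]

# Stand-in guardrail: a naive keyword filter. A real control would be
# far more robust than substring matching.
def naive_filter(prompt):
    banned = ("ignore previous instructions", "reveal your system prompt")
    return any(b in prompt.lower() for b in banned)
```

An empty return list means every archived attack is still blocked; any non-empty result reopens the incident before it reaches the post-incident review sign-off.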
Tabletop Exercise Scenarios
Run these quarterly with your IR team:
- Scenario: Customer reports the chatbot revealed another customer's account details
- Scenario: Security researcher publishes a blog post with your extracted system prompt and API keys
- Scenario: Internal monitoring detects a fine-tuned model was deployed with a backdoor
- Scenario: An employee's AI-generated phishing email compromises a VIP target
- Scenario: Your AI vendor (OpenAI/Anthropic) reports a data breach affecting your API usage