Incident Response for AI Systems
AI-Specific IR Considerations
Traditional incident response frameworks (NIST SP 800-61, SANS) apply, but AI incidents have unique characteristics:
- Attribution is harder. A prompt injection attack can be indistinguishable from a normal user query in the logs.
- Blast radius is unclear. If a model is compromised via poisoning, every output since the last known-good checkpoint is suspect.
- Evidence is ephemeral. Conversation logs may not capture the full context, and model state isn't easily snapshotted.
- Remediation is slow. You can't patch a model the way you patch software. Retraining can take weeks and cost millions.
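Because evidence is ephemeral, the practical mitigation is to capture enough context at inference time that an incident can be reconstructed later. A minimal sketch (all field names here are illustrative, not a standard schema):

```python
import hashlib
import json
import time

def log_inference(log_file, model_version, system_prompt, user_input,
                  retrieved_docs, output):
    """Append one inference record with enough context to reconstruct
    an incident later."""
    record = {
        "timestamp": time.time(),
        "model_version": model_version,
        # Hash the system prompt so changes are detectable without
        # storing the full prompt in every record.
        "system_prompt_sha256": hashlib.sha256(
            system_prompt.encode()).hexdigest(),
        "user_input": user_input,
        # Record which RAG documents were retrieved, not their contents.
        "retrieved_doc_ids": [d["id"] for d in retrieved_docs],
        "output": output,
    }
    log_file.write(json.dumps(record) + "\n")
    return record
```

Append-only JSONL keeps the log cheap to write and easy to grep during triage; the prompt hash lets you prove which system prompt was live at the time without bloating every record.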
AI Incident Categories
| Category | Example | Severity |
|---|---|---|
| Data leakage via AI | Model outputs PII, credentials, or proprietary data | Critical |
| Prompt injection in production | Attacker hijacks AI assistant behavior | High |
| Model compromise | Poisoned model deployed, backdoor activated | Critical |
| Shadow AI data exposure | Employee uploads sensitive data to unauthorized AI tool | High |
| Hallucination with impact | AI provides false information that drives a business decision | Medium-High |
| AI-powered social engineering | Deepfake or AI-generated phishing targeting employees | High |
| API abuse / extraction | Anomalous query patterns indicating model theft | Medium |
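For the API abuse / extraction category, a crude but useful first signal is per-client query volume in a sliding window. The sketch below flags clients exceeding a threshold; the thresholds are illustrative and should be tuned against your own baseline traffic, and a real detector would also look at query diversity:

```python
import time
from collections import defaultdict, deque

class ExtractionDetector:
    """Flags clients whose query volume in a sliding window exceeds a
    threshold -- a rough proxy for model-extraction probing."""

    def __init__(self, window_seconds=3600, max_queries=500):
        self.window = window_seconds
        self.max_queries = max_queries
        self.history = defaultdict(deque)  # client_id -> query timestamps

    def record(self, client_id, now=None):
        """Record one query; return True if the client looks suspicious."""
        now = time.time() if now is None else now
        q = self.history[client_id]
        q.append(now)
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) > self.max_queries
```

A flag here should feed your triage queue, not an automatic block: legitimate batch workloads also produce high volume.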
Response Playbook
Immediate (0-4 hours)
- Confirm the incident — is this a real AI-specific issue or a traditional security incident?
- Contain — disable the affected AI endpoint, revoke API keys, block the source
- Preserve evidence — export conversation logs, model version, system prompt, RAG state
- Notify stakeholders — CISO, legal, privacy team, affected business owners
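The preserve-evidence step above can be partly automated so responders don't improvise it at 2 a.m. A minimal sketch, assuming the artifacts are available as in-memory objects (the structure and names are illustrative):

```python
import hashlib
import json
import time

def preserve_evidence(bundle_path, conversation_logs, model_version,
                      system_prompt, rag_index_manifest):
    """Write a timestamped evidence bundle covering the items in the
    playbook step, and return a hash that seals its contents."""
    bundle = {
        "captured_at": time.time(),
        "model_version": model_version,
        "system_prompt": system_prompt,
        "rag_index_manifest": rag_index_manifest,
        "conversation_logs": conversation_logs,
    }
    # Canonical JSON so the same bundle always yields the same digest.
    payload = json.dumps(bundle, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    with open(bundle_path, "w") as f:
        f.write(payload)
    return digest  # record this hash in the incident ticket
```

Recording the digest in the ticket at capture time gives you a chain-of-custody anchor: anyone can later verify the bundle hasn't been altered.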
Short-term (4-48 hours)
- Determine scope — how many users were affected? What data was exposed?
- Root cause analysis — was it injection, poisoning, misconfiguration, or insider?
- Remediate — patch the system prompt, update filters, roll back the model if needed
- Communicate — internal notification, customer notification if data exposed
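Scope determination often starts as a sweep of the preserved conversation logs for PII-looking output. A first-cut sketch (the patterns are illustrative; real scoping should lean on your DLP tooling):

```python
import re

# Illustrative patterns only -- extend to match your data classes.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scope_exposure(conversation_logs):
    """Return the set of user IDs whose conversations contain
    PII-looking model output, as a first cut at blast radius."""
    affected = set()
    for entry in conversation_logs:  # dicts: {"user_id", "output"}
        for pattern in PII_PATTERNS.values():
            if pattern.search(entry["output"]):
                affected.add(entry["user_id"])
    return affected
```

The output is an upper-bound candidate list: every hit still needs human review before it drives customer notification.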
Long-term (1-4 weeks)
- Post-incident review — what failed and why?
- Update controls — new filters, monitoring rules, access restrictions
- Red team validation — test that the fix actually works
- Policy updates — revise AI governance based on lessons learned
- Regulatory reporting — if required (GDPR breach notification, etc.)
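Red team validation is easiest to sustain when the original attack prompts become a regression suite replayed against the patched control. A minimal sketch; `naive_filter` is a stand-in for whatever guardrail you actually shipped:

```python
def blocks_injection(guardrail, attack_prompts):
    """Replay known attack prompts against the patched guardrail and
    return any that still get through. `guardrail` is any callable
    returning True when a prompt is blocked (illustrative interface)."""
    return [p for p in attack_prompts if not guardrail(p)]

# Stand-in guardrail: a naive keyword filter. A real control would be
# far more robust than substring matching.
def naive_filter(prompt):
    banned = ("ignore previous instructions", "reveal your system prompt")
    return any(b in prompt.lower() for b in banned)
```

An empty return list means every archived attack is still blocked; any non-empty result reopens the incident before it reaches the post-incident review sign-off.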
Tabletop Exercise Scenarios
Run these quarterly with your IR team:
- Scenario: Customer reports the chatbot revealed another customer's account details
- Scenario: Security researcher publishes a blog post with your extracted system prompt and API keys
- Scenario: Internal monitoring detects a fine-tuned model was deployed with a backdoor
- Scenario: An employee's AI-generated phishing email compromises a VIP target
- Scenario: Your AI vendor (OpenAI/Anthropic) reports a data breach affecting your API usage