AI Audit Checklist

Purpose

A pre-deployment audit checklist for AI systems. Use this before promoting any AI feature, model, or integration to production. Adapt the scope based on the system's risk tier.

Risk Tiering

Determine the audit depth based on system risk:

TierCriteriaAudit Depth
CriticalAffects financial decisions, medical outcomes, legal determinations, or critical infrastructureFull checklist — every item
HighProcesses PII, makes automated decisions about people, or has tool-use capabilitiesFull checklist minus physical security items
MediumInternal-facing, no PII, human-in-the-loop for all decisionsCore sections only (governance, data, security, monitoring)
LowNon-sensitive internal tool, no decision-making authorityGovernance and security sections only

1. Governance & Documentation

□ AI system registered in the organizational AI inventory
□ System owner and accountable executive identified
□ Risk tier classification completed and documented
□ Intended use case documented with clear boundaries
□ Out-of-scope uses explicitly listed
□ Data Processing Impact Assessment (DPIA) completed if PII involved
□ AI Acceptable Use Policy compliance confirmed
□ Regulatory requirements mapped (EU AI Act tier, state laws, sector rules)
□ Third-party agreements reviewed (DPA, ToS, SLA)
□ Change management process defined for model updates

2. Data Governance

□ Training data sources documented with provenance
□ Training data scanned for PII — results documented
□ PII handling compliant with privacy policy and applicable regulations
□ Data consent basis verified for AI training use
□ Data deduplication applied to reduce memorization risk
□ Data quality assessment completed
□ Bias assessment on training data completed
□ Data retention and deletion procedures defined
□ RAG knowledge base contents reviewed and approved
□ Vector database access controls configured

3. Model Security

□ Model artifact integrity verified (hash check against source)
□ Model format is safe (safetensors preferred over pickle)
□ Model provenance documented (source, version, modifications)
□ System prompt reviewed by security team
□ No credentials, API keys, or internal URLs in system prompt
□ Tool permissions scoped to minimum necessary
□ Model access controls configured (who can query, who can modify)
□ Model version pinned (not auto-updating without review)
□ Fine-tuning data reviewed for poisoning indicators
□ Model weight storage encrypted with access logging

4. Security Testing

□ Prompt injection testing completed
  □ Direct injection attempts
  □ Indirect injection via all data input channels
  □ System prompt extraction attempts
□ Jailbreak testing completed
  □ Role-play and persona attacks
  □ Encoding and obfuscation bypasses
  □ Multi-turn escalation attempts
□ Data leakage testing completed
  □ PII extraction attempts
  □ Training data extraction probes
  □ Cross-user data isolation verified
□ Tool abuse testing completed (if applicable)
  □ Unauthorized API calls via injection
  □ Data exfiltration via tool use
  □ Privilege escalation through tool chaining
□ Denial of service testing
  □ Context window stuffing
  □ Rate limit validation
  □ Timeout enforcement verification
□ All findings documented with severity ratings
□ Critical and high findings remediated before deployment
□ Accepted risks documented with compensating controls

5. Input/Output Controls

□ Input length limits configured
□ Input content filtering active (injection detection)
□ PII detection active on inputs (redaction or blocking)
□ Output PII scanning active
□ Output content safety classification active
□ System prompt leakage detection active
□ Response length limits configured
□ Confidence thresholds defined for human escalation
□ Hallucination mitigation in place (RAG grounding, disclaimers)
□ Error handling returns safe fallback responses (no stack traces or model internals)

6. Access Control

□ Authentication required for all AI endpoints
□ Authorization enforced — users only access appropriate AI capabilities
□ API keys scoped with minimum necessary permissions
□ Rate limiting configured per user, per key, and per IP
□ Admin access to model configuration requires MFA
□ System prompt modifications go through change management
□ API key rotation schedule defined
□ Service account permissions follow least privilege

7. Monitoring & Observability

□ Request/response logging active (with PII redaction)
□ Performance metrics monitored (latency, error rate, throughput)
□ Cost monitoring and alerting configured
□ Anomaly detection on query patterns (extraction indicators)
□ Drift monitoring baseline established
□ Safety metric monitoring active (toxicity, refusal rate, PII in outputs)
□ Alerting thresholds defined and tested
□ Dashboard accessible to security and operations teams
□ Log retention period defined and compliant with policy

8. Resilience & Incident Response

□ Fallback path tested — what happens when AI is unavailable?
□ Circuit breaker configured and tested
□ Model rollback procedure documented and tested
□ Incident response playbook includes AI-specific scenarios
□ Escalation path defined for AI security incidents
□ Kill switch available to disable AI features immediately
□ Backup model or degraded service mode tested
□ Recovery time objective (RTO) defined for AI service restoration

9. Bias & Fairness (for systems affecting people)

□ Protected attributes identified for the use case
□ Disaggregated evaluation completed across demographic groups
□ Fairness metrics selected and evaluated
□ Intersectional analysis completed
□ Identified biases documented with mitigation steps
□ Ongoing bias monitoring plan established
□ Bias audit schedule defined (annual minimum for regulated uses)
□ AI disclosure requirements met (inform users they're interacting with AI)
□ Applicable regulations identified and requirements mapped
□ Explainability requirements met for the risk tier
□ Record-keeping requirements satisfied
□ Adverse action notice procedures defined (if applicable — lending, hiring)
□ IP review completed — AI outputs don't infringe on copyrighted content
□ Insurance coverage reviewed for AI-related liability
□ Regulatory filing requirements identified and scheduled

Sign-Off

RoleNameDateApproval
System Owner□ Approved
Security Lead□ Approved
Privacy/Legal□ Approved
ML Engineering□ Approved
Business Owner□ Approved
CISO (Critical/High tier only)□ Approved

Post-Deployment Review Schedule

ReviewFrequencyOwner
Performance metrics reviewWeeklyML Engineering
Security monitoring reviewWeeklySecurity Operations
Drift assessmentMonthlyML Engineering
Bias auditQuarterly / AnnuallyAI Governance
Full re-auditAnnually or on major model changeCross-functional
Red team assessmentAnnually minimumSecurity / Red Team