Failover & Fallback Strategies

Why AI Systems Need Fallbacks

AI systems can fail in ways traditional software does not: hallucinating confidently, degrading gradually, or being compromised by an adversary without raising any obvious error. Fallbacks keep the service running when the primary AI path can no longer be trusted.

Fallback Architecture

Tier 1: Model Fallback

Primary model fails → route to a secondary model.

Primary	Fallback	Trade-off
GPT-4o	Claude 3.5 Sonnet	Different vendor, similar capability
Claude 3.5 Sonnet	Llama 3 70B (self-hosted)	No vendor dependency, lower quality
Custom fine-tune	Base model without fine-tuning	Loses specialization, maintains function
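
As a rough illustration of Tier 1, the sketch below tries each model in order and returns the first successful answer. The `primary` and `secondary` functions are hypothetical stand-ins for real vendor clients, and treating any exception as a failure is an assumption; a production system would likely distinguish timeouts, rate limits, and content errors.

```python
from typing import Callable, List, Sequence, Tuple

class AllModelsFailed(Exception):
    """Raised when every model in the chain has failed."""

def call_with_fallback(
    prompt: str,
    models: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, callable) in order; return (name, answer) from the first that works."""
    errors: List[str] = []
    for name, call in models:
        try:
            return name, call(prompt)
        except Exception as exc:  # assumption: any exception means "try the next model"
            errors.append(f"{name}: {exc!r}")
    raise AllModelsFailed("; ".join(errors))

# Hypothetical stand-ins for real model clients.
def primary(prompt: str) -> str:
    raise TimeoutError("primary model timed out")

def secondary(prompt: str) -> str:
    return f"secondary answer to: {prompt}"

if __name__ == "__main__":
    print(call_with_fallback("hello", [("primary", primary), ("secondary", secondary)]))
```

Returning the model name alongside the answer lets callers log which tier actually served the request.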

Tier 2: Degraded Service

All models unavailable → serve reduced functionality.

Return cached responses for common queries
Route to rule-based system (decision tree, keyword matching)
Display "AI unavailable" with human escalation option
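
The three degraded-service options above can be chained so that each one is tried in turn. This is a minimal sketch: the cache contents, keyword rules, and message wording are all invented for illustration, and exact-match caching on the normalized query is an assumption.

```python
# Hypothetical cache of answers to common queries, keyed by normalized text.
CACHE = {
    "what are your hours?": "We are open 9am to 5pm, Monday to Friday.",
}

# Hypothetical keyword rules standing in for a rule-based system.
RULES = [
    ("refund", "Refunds can be requested from the Orders page."),
    ("password", "Use the 'Forgot password' link on the sign-in page."),
]

UNAVAILABLE = "AI unavailable. Reply AGENT to be connected to a human."

def degraded_response(query: str) -> str:
    """Serve a reduced-functionality answer without calling any model."""
    key = query.strip().lower()
    if key in CACHE:                      # 1. cached response
        return CACHE[key]
    for keyword, answer in RULES:         # 2. keyword matching
        if keyword in key:
            return answer
    return UNAVAILABLE                    # 3. explicit notice with escalation option

if __name__ == "__main__":
    print(degraded_response("How do I reset my password?"))
```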

Tier 3: Human Fallback

AI system compromised or unreliable → route to humans.

Live chat agents handle queries directly
Queue system with SLA for response time
Automated triage routes to appropriate human team
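
A small sketch of how triage and an SLA-ordered queue might fit together. The team names, keywords, and SLA values are hypothetical, and keyword triage is only one possible routing method.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical response-time SLAs per team, in seconds.
SLA_SECONDS = {"security": 300, "billing": 900, "general": 3600}

def triage(query: str) -> str:
    """Pick a human team from simple keywords (illustrative only)."""
    text = query.lower()
    if any(word in text for word in ("hacked", "breach", "phishing")):
        return "security"
    if any(word in text for word in ("invoice", "refund", "charge")):
        return "billing"
    return "general"

@dataclass(order=True)
class Ticket:
    deadline: float                     # only field used for ordering
    query: str = field(compare=False)
    team: str = field(compare=False)

class HumanQueue:
    """Priority queue that hands out the ticket closest to its SLA deadline."""

    def __init__(self) -> None:
        self._heap: List[Ticket] = []

    def submit(self, query: str, now: float) -> Ticket:
        team = triage(query)
        ticket = Ticket(deadline=now + SLA_SECONDS[team], query=query, team=team)
        heapq.heappush(self._heap, ticket)
        return ticket

    def next_ticket(self) -> Optional[Ticket]:
        return heapq.heappop(self._heap) if self._heap else None
```

Ordering by deadline rather than arrival time means a security report submitted late can still be answered before an older general question.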

Implementation Patterns

Circuit Breaker

Monitor error rate → if rate > threshold for N seconds:
  → Open circuit (stop sending to primary)
  → Route all traffic to fallback
  → After cooldown period, test primary with canary request
  → If canary succeeds, close circuit (resume primary)
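
The steps above can be sketched in Python roughly as follows. Several details are assumptions not stated in the text: the error rate is measured over a sliding window of `window_s` seconds, a minimum call count guards against tripping on a single failure, and the first live request after the cooldown doubles as the canary. All parameter names and defaults are illustrative.

```python
import time
from collections import deque

class CircuitBreaker:
    def __init__(self, threshold=0.5, window_s=30.0, cooldown_s=60.0,
                 min_calls=5, clock=time.monotonic):
        self.threshold = threshold      # error rate that opens the circuit
        self.window_s = window_s        # sliding window length in seconds
        self.cooldown_s = cooldown_s    # wait before sending a canary
        self.min_calls = min_calls      # ignore the rate until this many calls
        self.clock = clock
        self._events = deque()          # (timestamp, failed) pairs
        self._opened_at = None          # None means the circuit is closed

    @property
    def is_open(self):
        return self._opened_at is not None

    def _error_rate(self, now):
        while self._events and now - self._events[0][0] > self.window_s:
            self._events.popleft()
        if len(self._events) < self.min_calls:
            return 0.0
        return sum(failed for _, failed in self._events) / len(self._events)

    def call(self, primary, fallback, request):
        now = self.clock()
        if self.is_open:
            if now - self._opened_at < self.cooldown_s:
                return fallback(request)        # circuit open: skip primary
            try:
                result = primary(request)       # cooldown over: canary request
            except Exception:
                self._opened_at = now           # canary failed: restart cooldown
                return fallback(request)
            self._opened_at = None              # canary succeeded: close circuit
            self._events.clear()
            return result
        try:
            result = primary(request)
            self._events.append((now, False))
            return result
        except Exception:
            self._events.append((now, True))
            if self._error_rate(now) > self.threshold:
                self._opened_at = now           # open the circuit
            return fallback(request)
```

Passing the clock in as a parameter keeps the breaker testable without real waiting. A synthetic canary request, rather than a live user request, would be a reasonable alternative when a failed canary is costly.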

Confidence Gating

Model produces response with confidence score
  → If confidence ≥ threshold: return response
  → If critical threshold ≤ confidence < threshold: flag for human review
  → If confidence < critical threshold: route to fallback
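
A minimal sketch of the gate, assuming a confidence score in [0, 1] is available from somewhere (many model APIs do not return a calibrated one, so its source is left open here). The critical threshold is checked first so the two "below threshold" cases cannot overlap, and a score exactly at the threshold is treated as passing; both choices, along with the default values, are assumptions.

```python
from enum import Enum

class Action(Enum):
    RETURN = "return"
    HUMAN_REVIEW = "human_review"
    FALLBACK = "fallback"

def gate(confidence: float, threshold: float = 0.8,
         critical_threshold: float = 0.5) -> Action:
    """Map a confidence score to one of three actions."""
    if not 0.0 <= critical_threshold <= threshold <= 1.0:
        raise ValueError("expected 0 <= critical_threshold <= threshold <= 1")
    if confidence < critical_threshold:
        return Action.FALLBACK          # too unreliable to show at all
    if confidence < threshold:
        return Action.HUMAN_REVIEW      # usable, but needs a human check
    return Action.RETURN                # confident enough to return directly

if __name__ == "__main__":
    for score in (0.95, 0.65, 0.2):
        print(score, gate(score).value)
```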

Cost-Based Circuit Breaker

Track API spend per hour
  → If spend > 2x normal: alert
  → If spend > 5x normal: switch to cheaper fallback model
  → If spend > 10x normal: suspend AI service, route to humans
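
The spend thresholds above might be implemented along these lines. The sketch assumes "per hour" means a rolling 60-minute window and that a baseline `normal_hourly_spend` is known in advance; the class name and action labels are hypothetical.

```python
import time
from collections import deque

class SpendMonitor:
    def __init__(self, normal_hourly_spend: float, clock=time.monotonic):
        if normal_hourly_spend <= 0:
            raise ValueError("normal_hourly_spend must be positive")
        self.normal = normal_hourly_spend
        self.clock = clock
        self._charges = deque()         # (timestamp, cost) pairs

    def record(self, cost: float) -> None:
        """Record the cost of one API call."""
        self._charges.append((self.clock(), cost))

    def spend_last_hour(self) -> float:
        cutoff = self.clock() - 3600
        while self._charges and self._charges[0][0] < cutoff:
            self._charges.popleft()     # drop charges older than one hour
        return sum(cost for _, cost in self._charges)

    def action(self) -> str:
        """Return the most severe action whose threshold is exceeded."""
        ratio = self.spend_last_hour() / self.normal
        if ratio > 10:
            return "suspend_and_route_to_humans"
        if ratio > 5:
            return "switch_to_cheaper_model"
        if ratio > 2:
            return "alert"
        return "normal"
```

Checking the highest multiple first means each call to `action()` reports only the most severe step; a real deployment would probably also keep the alert firing at the higher levels.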
