Testing & Exploitation

Test Execution Framework

Run through extraction techniques in order of sophistication. Document the full extracted prompt.

Systematic testing against content restrictions:

Test every data input channel for injection:

Channel	Test Method
Direct user input	Type injection payloads directly
RAG documents	Upload documents containing injection
Web content	If AI browses, test with a controlled page containing injection
Tool outputs	If tools are available, test if tool output can contain injection
File uploads	Embed instructions in uploaded files (PDFs, images with EXIF data)

Prove real-world consequences:

Data exfiltration: Can the model leak system prompt, user data, or knowledge base content?
Unauthorized actions: Can you trigger tool calls the user didn't request?
Cross-user contamination: Can you affect other users' sessions?
Persistence: Can you modify the knowledge base or system behavior persistently?

Record everything: