The AI agent you're supposed to break.
DVAA is an intentionally vulnerable platform for learning AI agent security, red teaming, and validating security tools. 14 agents across three protocol tiers, 12 vulnerability categories, and 22 capture the flag challenges. Run it locally. Break it. Then break it again.
docker run opena2a/dvaa:0.8.0Dashboard on :9000. Agents on :7001 to :7021. Full port map below.
A safe place to break things.
DVAA is the equivalent of DVWA for AI agents. A deliberately insecure platform for security professionals, researchers, and developers to practice attacking and defending AI agent systems in a safe, legal environment.
Practice attacks safely
Run prompt injection, jailbreak, and exfiltration techniques against intentionally weak agents. No production systems involved.
Validate security tools
Use DVAA as a known vulnerable target to verify HackMyAgent, OASB benchmark runners, and your own scanners produce expected findings.
Study defense in depth
Compare LegacyBot against SecureBot side by side to see which controls block which attack classes.
Capture flags
Work through 22 challenges across four difficulty levels. Each challenge has a specific objective, target agent, and flag to capture.
Attack classes mapped to OASB.
Eight categories of AI agent vulnerabilities, each cross referenced with the Open Agent Security Benchmark. Use them as a study path or as a checklist when validating your own scanner.
Prompt Injection
OASB 3.1Inject instructions into agent prompts to override behavior, extract system prompts, or bypass safety filters.
Jailbreak
OASB 3.3Bypass alignment and safety constraints to make agents perform restricted actions or reveal hidden instructions.
Data Exfiltration
OASB 4.3Extract sensitive data from agent memory, RAG stores, or connected databases through indirect channels.
Capability Abuse
OASB 2.2Exploit legitimate agent capabilities beyond intended scope. File access, code execution, or API calls.
Context Manipulation
OASB 8.1Poison or manipulate the context window to alter agent reasoning, inject false data, or cause hallucinations.
MCP Exploitation
OASB 2.3Attack Model Context Protocol servers. Tool poisoning, schema injection, and cross server escalation.
A2A Attacks
OASB 1.4Exploit agent to agent communication. Identity spoofing, message tampering, and delegation chain abuse.
Supply Chain
OASB 6.1Compromise agent dependencies. Malicious tools, poisoned embeddings, and compromised model endpoints.
Fourteen agents. Five posture levels.
Start with LegacyBot at the Critical level to learn the basics. Work your way up to SecureBot at the Hardened level to see what proper input validation, output filtering, and capability boundaries actually buy you.
| Agent | Port | Protocol | Posture |
|---|---|---|---|
| SecureBot | :3001 | API | Hardened |
| HelperBot | :3002 | API | Weak |
| LegacyBot | :3003 | API | Critical |
| CodeBot | :3004 | API | Vulnerable |
| RAGBot | :3005 | API | Weak |
| VisionBot | :3006 | API | Weak |
| ToolBot | :3010 | MCP | Vulnerable |
| DataBot | :3011 | MCP | Weak |
| Orchestrator | :3020 | A2A | Standard |
| Worker | :3021 | A2A | Weak |
Twenty two challenges. Four levels.
Each challenge ships with a specific objective, target agent, and flag to capture. Total available is 5,900 points across the full challenge set.
Three protocol tiers. Clear port ranges.
Agents are grouped into three protocol tiers, each on its own port range. The dashboard runs separately on port 9000 for orchestration and CTF tracking.
API Agents
:3000 to 3006OpenAI API
SecureBot, HelperBot, LegacyBot, CodeBot, RAGBot, VisionBot
MCP Servers
:3010 to 3011MCP JSON-RPC
ToolBot, DataBot
A2A Agents
:3020 to 3021A2A Message
Orchestrator, Worker
Dashboard
:9000HTTP
Web UI
Four ways to get running.
Pick whichever fits your workflow. Docker Hub is the fastest path. The Node.js path is best when you want to read the agent source while you attack it.
Docker Hub
Fastest path. No clone, no build.
Docker Compose
Reproducible local stack with one command.
Node.js
Read the agent source while you attack.
OpenA2A CLI
One verb. Pulls, maps ports, starts.
Validate your security tools.
DVAA is the reference target the OpenA2A toolchain regression tests against. Use it to ground truth your own scanners. If a scanner gives LegacyBot a clean bill of health, the scanner is wrong.
Run HackMyAgent 0.22.2 against LegacyBot to confirm 209 static, 29 semantic, and 164 adversarial checks fire as expected.
Use opena2a-cli 0.10.2 with
benchmarkto run the OASB suite against the full agent fleet.Compare findings between SecureBot and LegacyBot to verify your scanner discriminates between hardened and vulnerable postures.
Treat any false negative on LegacyBot as a regression. Treat any false positive on SecureBot as a precision bug.
Start breaking AI agents.
Pull the image. Open the dashboard. Pick a target. The whole stack runs locally with no signup, no account, and no telemetry.
docker run opena2a/dvaa:0.8.0