OASB
Open Agent Security Benchmark. 222 standardized attack scenarios for security scoring and compliance.
OASB provides a repeatable, standardized method for measuring the security posture of AI agents. It defines 222 attack scenarios organized into 30 security categories, each targeting a specific weakness in how agents handle prompts, tools, data, and inter-agent communication. By running the full benchmark against an agent endpoint, you get a normalized score (0-100) that quantifies how well the agent defends against known attack patterns -- and a category-level breakdown that identifies exactly where to focus remediation.
Installation
```shell
# via npm
npm install -g hackmyagent

# or via Homebrew
brew install opena2a-org/tap/hackmyagent

# then run the benchmark
opena2a benchmark
```

How Benchmarking Works
OASB connects to an agent's HTTP endpoint and executes each attack scenario sequentially. For every scenario, the benchmark:
- Sends the attack payload to the agent endpoint
- Analyzes the agent's response for indicators of successful defense or exploitation
- Records the result as pass (defended) or fail (exploited) with supporting evidence
- Aggregates results into per-category and overall scores
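The per-scenario loop above can be sketched in a few lines. This is an illustrative sketch, not OASB's actual internals: the type names, the request shape (`{ input: payload }`), and the `detectExploit` callback are all assumptions made for the example.

```typescript
// Sketch of one benchmark scenario: send payload, classify the response.
// All names and the request shape are illustrative assumptions.
interface Scenario {
  id: string;
  category: string;
  payload: string;
  // Returns true when the agent's response shows the attack succeeded.
  detectExploit: (response: string) => boolean;
}

interface ScenarioResult {
  scenarioId: string;
  category: string;
  passed: boolean;   // true = defended, false = exploited
  evidence: string;  // raw response, kept for the report
}

// Pure step: classify a response as defended or exploited.
function evaluate(s: Scenario, response: string): ScenarioResult {
  const exploited = s.detectExploit(response);
  return { scenarioId: s.id, category: s.category, passed: !exploited, evidence: response };
}

// Network step: send the attack payload and evaluate the reply.
async function runScenario(endpoint: string, s: Scenario): Promise<ScenarioResult> {
  const res = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ input: s.payload }),
  });
  return evaluate(s, await res.text());
}
```

Separating the pure `evaluate` step from the network call is what makes the run deterministic given identical agent responses: the classification depends only on the response text.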
The benchmark is deterministic -- running it twice against the same agent with the same configuration produces the same score. This makes it suitable for CI/CD integration and regression tracking over time.
Attack Scenario Categories
The 222 scenarios span 30 categories. The six primary categories are:
Prompt Injection
Direct, indirect, and multi-turn prompt injection attacks.
Tool Misuse
Unauthorized tool invocation and parameter manipulation.
Data Exfiltration
Information leakage through various channels.
Privilege Escalation
Attempts to gain unauthorized capabilities.
Denial of Service
Resource exhaustion and infinite loop attacks.
Supply Chain
Dependency confusion and MCP server spoofing.
Additional categories include credential handling, output sanitization, context window manipulation, excessive agency, MCP protocol abuse, A2A delegation attacks, and more. See the OASB repository for the complete scenario catalog.
Scoring Methodology
OASB produces a normalized security score from 0 to 100 based on the percentage of attack scenarios the agent successfully defends against. Scores are broken down by category for targeted remediation.
| Score Range | Interpretation |
|---|---|
| 80-100 | Strong defenses across most categories. Focus on remaining gaps. |
| 50-79 | Moderate coverage with clear areas for improvement. Category breakdown identifies priorities. |
| 0-49 | Significant exposure. Review category-level results and address high-severity categories first. |
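Since the score is the percentage of defended scenarios, the computation reduces to simple ratios. The sketch below assumes a per-category tally shape (`passed`/`total`); the real report schema may differ.

```typescript
// Illustrative scoring math; the field names are assumptions, not OASB's schema.
interface CategoryResult {
  category: string;
  passed: number; // scenarios defended
  total: number;  // scenarios run
}

// Normalized 0-100 score: percentage of all scenarios defended.
function overallScore(results: CategoryResult[]): number {
  const total = results.reduce((n, r) => n + r.total, 0);
  const passed = results.reduce((n, r) => n + r.passed, 0);
  return total === 0 ? 0 : Math.round((passed / total) * 100);
}

// Same ratio computed per category, for targeted remediation.
function categoryScores(results: CategoryResult[]): Record<string, number> {
  const out: Record<string, number> = {};
  for (const r of results) {
    out[r.category] = r.total === 0 ? 0 : Math.round((r.passed / r.total) * 100);
  }
  return out;
}
```

For example, 8/10 defended in one category and 5/10 in another yields category scores of 80 and 50 and an overall score of 65.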
Comparing Agents
Because OASB scores are normalized and deterministic, you can use them to compare the security posture of different agents or track a single agent's improvement over time. Export JSON reports from multiple benchmark runs and diff the per-category scores to identify regressions or improvements after configuration changes.
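Diffing two exported reports is a straightforward per-category subtraction. The `byCategory` field here is a hypothetical report shape for illustration; check the actual JSON output for the real field names.

```typescript
// Hypothetical report shape; OASB's exported JSON schema may differ.
interface Report {
  byCategory: Record<string, number>; // category name -> 0-100 score
}

// Per-category delta between two runs: positive = improvement,
// negative = regression. Categories missing from a run count as 0.
function diffReports(before: Report, after: Report): Record<string, number> {
  const deltas: Record<string, number> = {};
  const all = { ...before.byCategory, ...after.byCategory };
  for (const c of Object.keys(all)) {
    deltas[c] = (after.byCategory[c] ?? 0) - (before.byCategory[c] ?? 0);
  }
  return deltas;
}
```

Running this after a configuration change gives an immediate regression signal per category, which is easier to act on than comparing overall scores alone.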
Relationship to HMA Scans
HackMyAgent performs static configuration analysis (checking for hardcoded credentials, insecure MCP configs, missing governance files). OASB complements this with dynamic runtime testing -- sending actual attack payloads to a running agent. Use HMA for pre-deployment checks and OASB for validating runtime behavior.
Usage
```shell
# Run the full benchmark against a local agent endpoint
oasb benchmark http://localhost:3000

# Run only the prompt-injection category
oasb benchmark http://localhost:3000 --category prompt-injection

# Export a JSON report
oasb benchmark http://localhost:3000 --format json > oasb-report.json

# Via the opena2a CLI
opena2a benchmark http://localhost:3000
```