OASB

Open Agent Security Benchmark. 222 standardized attack scenarios for security scoring and compliance.

OASB provides a repeatable, standardized method for measuring the security posture of AI agents. It defines 222 attack scenarios organized into 30 security categories, each targeting a specific weakness in how agents handle prompts, tools, data, and inter-agent communication. By running the full benchmark against an agent endpoint, you get a normalized score (0-100) that quantifies how well the agent defends against known attack patterns -- and a category-level breakdown that identifies exactly where to focus remediation.

Installation

Via npm
npm install -g hackmyagent
Via Homebrew
brew install opena2a-org/tap/hackmyagent
Via OpenA2A CLI
opena2a benchmark

How Benchmarking Works

OASB connects to an agent's HTTP endpoint and executes each attack scenario sequentially. For every scenario, the benchmark:

  1. Sends the attack payload to the agent endpoint
  2. Analyzes the agent's response for indicators of successful defense or exploitation
  3. Records the result as pass (defended) or fail (exploited) with supporting evidence
  4. Aggregates results into per-category and overall scores

The benchmark is deterministic -- running it twice against the same agent with the same configuration produces the same score. This makes it suitable for CI/CD integration and regression tracking over time.
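The per-scenario loop above can be sketched as follows. This is a hypothetical illustration, not OASB's actual implementation: the `Scenario` fields and the `send_payload` and `looks_exploited` callables are stand-ins for the real payload delivery and response analysis.

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    category: str   # e.g. "prompt-injection"
    payload: str    # the attack payload sent to the agent

def run_benchmark(scenarios, send_payload, looks_exploited):
    """Run every scenario and aggregate pass/fail results per category.

    send_payload(payload) -> response    : delivers the attack to the endpoint
    looks_exploited(scenario, response)  : analyzes the response for exploitation
    Returns {category: (passed_count, total_count)}.
    """
    results = {}
    for s in scenarios:
        response = send_payload(s.payload)         # 1. send the attack payload
        exploited = looks_exploited(s, response)   # 2. analyze the response
        passed = not exploited                     # 3. pass = defended
        results.setdefault(s.category, []).append(passed)
    # 4. aggregate into per-category (passed, total) counts
    return {cat: (sum(r), len(r)) for cat, r in results.items()}
```

Because the loop is a pure function of the scenario list and the agent's responses, two runs against the same agent and configuration aggregate to the same counts, which is what makes regression tracking in CI/CD practical.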

Attack Scenario Categories

The 222 scenarios span 30 categories. The six primary categories are:

Prompt Injection

Direct, indirect, and multi-turn prompt injection attacks.

Tool Misuse

Unauthorized tool invocation and parameter manipulation.

Data Exfiltration

Information leakage through various channels.

Privilege Escalation

Attempts to gain unauthorized capabilities.

Denial of Service

Resource exhaustion and infinite loop attacks.

Supply Chain

Dependency confusion and MCP server spoofing.

Additional categories include credential handling, output sanitization, context window manipulation, excessive agency, MCP protocol abuse, A2A delegation attacks, and more. See the OASB repository for the complete scenario catalog.

Scoring Methodology

OASB produces a normalized security score from 0 to 100 based on the percentage of attack scenarios the agent successfully defends against. Scores are broken down by category for targeted remediation.

Score Range   Interpretation
80-100        Strong defenses across most categories. Focus on remaining gaps.
50-79         Moderate coverage with clear areas for improvement. Category breakdown identifies priorities.
0-49          Significant exposure. Review category-level results and address high-severity categories first.
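The normalization described above can be sketched as a simple percentage. This is an assumed reading of the methodology (score = defended scenarios as a share of total, per category and overall), not OASB's published formula.

```python
def security_score(passed: int, total: int) -> int:
    """Normalize a pass count to a 0-100 score (rounded percentage)."""
    if total == 0:
        return 0
    return round(100 * passed / total)

def category_scores(results: dict) -> dict:
    """results maps category -> (passed, total); returns category -> 0-100 score."""
    return {cat: security_score(p, t) for cat, (p, t) in results.items()}
```

For example, defending 180 of the 222 scenarios would normalize to a score of 81, placing the agent in the 80-100 band.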

Comparing Agents

Because OASB scores are normalized and deterministic, you can use them to compare the security posture of different agents or track a single agent's improvement over time. Export JSON reports from multiple benchmark runs and diff the per-category scores to identify regressions or improvements after configuration changes.
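A diff of two exported reports might look like the sketch below. The report layout assumed here ({"categories": {name: score}}) is hypothetical; consult the actual JSON output for the real schema.

```python
import json

def diff_reports(before_json: str, after_json: str) -> dict:
    """Return per-category score deltas between two benchmark runs.

    Positive delta = improvement, negative = regression. Categories present
    in only one report are treated as scoring 0 in the other.
    """
    before = json.loads(before_json)["categories"]
    after = json.loads(after_json)["categories"]
    return {
        cat: after.get(cat, 0) - before.get(cat, 0)
        for cat in sorted(set(before) | set(after))
    }
```

Running this against reports captured before and after a configuration change gives an at-a-glance view of which categories regressed and which improved.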

Relationship to HMA Scans

HackMyAgent performs static configuration analysis (checking for hardcoded credentials, insecure MCP configs, missing governance files). OASB complements this with dynamic runtime testing -- sending actual attack payloads to a running agent. Use HMA for pre-deployment checks and OASB for validating runtime behavior.

Usage

# Run full benchmark
oasb benchmark http://localhost:3000
# Run a single category
oasb benchmark http://localhost:3000 --category prompt-injection
# JSON report for compliance
oasb benchmark http://localhost:3000 --format json > oasb-report.json
# Via OpenA2A CLI
opena2a benchmark http://localhost:3000