OASB

Open Agent Security Benchmark. 222 standardized attack scenarios for security scoring and compliance.

OASB provides a repeatable, standardized method for measuring the security posture of AI agents. It defines 222 attack scenarios organized into 30 security categories, each targeting a specific weakness in how agents handle prompts, tools, data, and inter-agent communication. By running the full benchmark against an agent endpoint, you get a normalized score (0-100) that quantifies how well the agent defends against known attack patterns -- and a category-level breakdown that identifies exactly where to focus remediation.

Installation

Via npm
npm install -g hackmyagent
Via Homebrew
brew install opena2a-org/tap/hackmyagent
Via OpenA2A CLI
opena2a benchmark

How Benchmarking Works

OASB connects to an agent's HTTP endpoint and executes each attack scenario sequentially. For every scenario, the benchmark:

  1. Sends the attack payload to the agent endpoint
  2. Analyzes the agent's response for indicators of successful defense or exploitation
  3. Records the result as pass (defended) or fail (exploited) with supporting evidence
  4. Aggregates results into per-category and overall scores

The benchmark is deterministic -- running it twice against the same agent with the same configuration produces the same score. This makes it suitable for CI/CD integration and regression tracking over time.
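The per-scenario loop above can be sketched as follows. This is a hypothetical illustration, not OASB's actual implementation: the `Scenario` fields and the `send_payload` and `looks_exploited` callables are stand-ins for the real payload delivery and response analysis.

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    category: str   # e.g. "prompt-injection"
    payload: str    # the attack payload sent to the agent

def run_benchmark(scenarios, send_payload, looks_exploited):
    """Run every scenario and aggregate pass/fail results per category.

    send_payload(payload) -> response    : delivers the attack to the endpoint
    looks_exploited(scenario, response)  : analyzes the response for exploitation
    Returns {category: (passed_count, total_count)}.
    """
    results = {}
    for s in scenarios:
        response = send_payload(s.payload)         # 1. send the attack payload
        exploited = looks_exploited(s, response)   # 2. analyze the response
        passed = not exploited                     # 3. pass = defended
        results.setdefault(s.category, []).append(passed)
    # 4. aggregate into per-category (passed, total) counts
    return {cat: (sum(r), len(r)) for cat, r in results.items()}
```

Because the loop is a pure function of the scenario list and the agent's responses, two runs against the same agent and configuration aggregate to the same counts, which is what makes regression tracking in CI/CD practical.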

Attack Scenario Categories

The 222 scenarios span 30 categories. The six primary categories are:

Prompt Injection

Direct, indirect, and multi-turn prompt injection attacks.

Tool Misuse

Unauthorized tool invocation and parameter manipulation.

Data Exfiltration

Information leakage through various channels.

Privilege Escalation

Attempts to gain unauthorized capabilities.

Denial of Service

Resource exhaustion and infinite loop attacks.

Supply Chain

Dependency confusion and MCP server spoofing.

Additional categories include credential handling, output sanitization, context window manipulation, excessive agency, MCP protocol abuse, A2A delegation attacks, and more. See the OASB repository for the complete scenario catalog.

Scoring Methodology

OASB produces a normalized security score from 0 to 100 based on the percentage of attack scenarios the agent successfully defends against. Scores are broken down by category for targeted remediation.

Score Range   Interpretation
80-100        Strong defenses across most categories. Focus on remaining gaps.
50-79         Moderate coverage with clear areas for improvement. Category breakdown identifies priorities.
0-49          Significant exposure. Review category-level results and address high-severity categories first.
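The normalization described above can be sketched as a simple percentage. This is an assumed reading of the methodology (score = defended scenarios as a share of total, per category and overall), not OASB's published formula.

```python
def security_score(passed: int, total: int) -> int:
    """Normalize a pass count to a 0-100 score (rounded percentage)."""
    if total == 0:
        return 0
    return round(100 * passed / total)

def category_scores(results: dict) -> dict:
    """results maps category -> (passed, total); returns category -> 0-100 score."""
    return {cat: security_score(p, t) for cat, (p, t) in results.items()}
```

For example, defending 180 of the 222 scenarios would normalize to a score of 81, placing the agent in the 80-100 band.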

Comparing Agents

Because OASB scores are normalized and deterministic, you can use them to compare the security posture of different agents or track a single agent's improvement over time. Export JSON reports from multiple benchmark runs and diff the per-category scores to identify regressions or improvements after configuration changes.
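A diff of two exported reports might look like the sketch below. The report layout assumed here ({"categories": {name: score}}) is hypothetical; consult the actual JSON output for the real schema.

```python
import json

def diff_reports(before_json: str, after_json: str) -> dict:
    """Return per-category score deltas between two benchmark runs.

    Positive delta = improvement, negative = regression. Categories present
    in only one report are treated as scoring 0 in the other.
    """
    before = json.loads(before_json)["categories"]
    after = json.loads(after_json)["categories"]
    return {
        cat: after.get(cat, 0) - before.get(cat, 0)
        for cat in sorted(set(before) | set(after))
    }
```

Running this against reports captured before and after a configuration change gives an at-a-glance view of which categories regressed and which improved.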

Relationship to HMA Scans

HackMyAgent performs static configuration analysis (checking for hardcoded credentials, insecure MCP configs, missing governance files). OASB complements this with dynamic runtime testing -- sending actual attack payloads to a running agent. Use HMA for pre-deployment checks and OASB for validating runtime behavior.

Usage

# Run full benchmark
oasb benchmark http://localhost:3000
# Run a single category
oasb benchmark http://localhost:3000 --category prompt-injection
# JSON report for compliance
oasb benchmark http://localhost:3000 --format json > oasb-report.json
# Via OpenA2A CLI
opena2a benchmark http://localhost:3000