Part of the security infrastructure for AI agents

Security toolkit for AI agents.

209 static checks, 29 semantic checks, 164 adversarial payloads, auto remediation with rollback.

$npx hackmyagent secure

Scans in seconds. Auto remediation with rollback. No signup.

View on GitHub Read the docs

What you get back.

One scan. Every finding tied to a check ID. Run with --fix to auto remediate.

Terminal

$ npx hackmyagent secure

HackMyAgent v0.23.0 · 209 static + 29 semantic checks

Findings

6 issues->5 auto fixable

Run npx hackmyagent secure --fix to auto remediate.

Detail

CRITICAL CRED-001 Hardcoded API key in config.json

CRITICAL MCP-003 MCP server with root filesystem access

HIGH NET-001 Server bound to 0.0.0.0

HIGH PERM-001 World readable secret files

MEDIUM GIT-002 Incomplete .gitignore patterns

LOW LOG-001 Missing audit trail configuration

Three modes of operation.

Scan for vulnerabilities. Attack with adversarial payloads. Fix with safe rollback.

Scan

209 static checks plus 29 semantic checks. Detect misconfigurations, hardcoded credentials, exposed endpoints, and supply chain risks.

Credential detection and scope drift
MCP server auditing
AI agent CVE detection
Governance file analysis (SOUL.md, ABGS)

Attack

164 adversarial payloads across 16 categories. Red team your agent with prompt injection, jailbreak, data exfiltration, capability abuse, context manipulation, and more.

Prompt injection (12 payloads)
Jailbreak (12 payloads)
Data exfiltration (11 payloads)
Custom payload support

Fix

Auto remediation with rollback. Dry run preview before applying changes. Automatic backups so any fix can be undone instantly.

Dry run preview mode
Automatic backup creation
One command rollback
Plugin based fix system

Coverage breakdown.

Static, semantic, and adversarial. Three layers across every domain HackMyAgent inspects.

209

Static checks

Semantic checks

164

Adversarial payloads

Credentials and secrets

Hardcoded API keys
Scope drift across providers
Plaintext .env files
Exposed tokens in configs

MCP and tool permissions

MCP server config audit
Root filesystem access
Tool permission boundaries
Skill manifest review

Network and runtime

Bind address exposure
Rate limiting
Process isolation
Sandboxing

Code and supply chain

Dependency vulnerabilities
AI agent CVE detection
Build artifact integrity
Signing and provenance

Governance and policy

SOUL.md scoring (ABGS)
OASB benchmark
Audit trail completeness
Policy enforcement

Identity and session

Authentication checks
Session lifetime
Encryption at rest
Heartbeat and liveness

Quick start.

No config files. Works out of the box. Six commands cover scan, fix, attack, benchmark, and rollback.

Terminal

# Scan everything
$ npx hackmyagent secure

# Scan and auto remediate
$ npx hackmyagent secure --fix

# Preview fixes before applying
$ npx hackmyagent secure --fix --dry-run

# Red team with adversarial payloads
$ npx hackmyagent attack --local

# Run OASB benchmark
$ npx hackmyagent secure -b oasb-1

# Rollback any changes
$ npx hackmyagent rollback

Attack categories.

164 adversarial payloads across 16 categories. Five most common patterns shown below.

Prompt Injection

12 payloads

Manipulate agent behavior via injected instructions in user input, retrieved documents, or tool output.

Jailbreak

12 payloads

Bypass safety guardrails and system constraints with persona shifts, hypothetical framing, and role escalation.

Data Exfiltration

11 payloads

Extract sensitive data, system prompts, credentials, or memory contents through indirect channels.

Capability Abuse

10 payloads

Misuse agent tools and capabilities for unintended actions outside the authorized scope.

Context Manipulation

10 payloads

Poison agent context, memory, or retrieval results to alter downstream decisions.

Supported targets.

Auto detects the host, then scans the right surfaces. No flags required.

Claude Code

CLAUDE.md, skills, MCP server configs

Cursor

.cursor/ rules, MCP configurations

VS Code

.vscode/mcp.json configurations

Generic MCP

Any MCP server setup

OASB benchmark compliance.

Run the Open Agent Security Benchmark (OASB-1) directly. 46 controls across 10 categories with three maturity levels.

Three maturity levels

L1 Essential

Foundational hardening

26 controls

L2 Standard

Production readiness

44 controls

L3 Hardened

High assurance

46 controls

Learn more about OASB

Terminal

$ npx hackmyagent secure -b oasb-1

  OASB-1 Benchmark Assessment
  Level: L1 Essential (26 controls)

  PASS  Identity and Provenance       4/4
  PASS  Capability and Authorization  5/5
  PASS  Input Security                5/5
  WARN  Output Security               3/4
  PASS  Credential Protection         5/5
  FAIL  Supply Chain Integrity        2/5

  Score: 84/100
  Rating: Passing

Secure your agents.

One command. No signup. Runs locally.

$npx hackmyagent secure

Star on GitHub Read the docs

Security toolkit for AI agents.

What you get back.

Three modes of operation.

Scan

Attack

Fix

Coverage breakdown.

Credentials and secrets

MCP and tool permissions

Network and runtime

Code and supply chain

Governance and policy

Identity and session

Quick start.

Attack categories.

Prompt Injection

Jailbreak

Data Exfiltration

Capability Abuse

Context Manipulation

Supported targets.

Claude Code

Cursor

VS Code

Generic MCP

OASB benchmark compliance.

10 assessment categories

Three maturity levels

Secure your agents.