Part of the security infrastructure for AI agents

Security toolkit for AI agents.

209 static checks, 29 semantic checks, 164 adversarial payloads, auto remediation with rollback.

$npx hackmyagent secure

Scans in seconds. Auto remediation with rollback. No signup.

What you get back.

One scan. Every finding tied to a check ID. Run with --fix to auto remediate.

Terminal
$ npx hackmyagent secure
HackMyAgent v0.23.0 ยท 209 static + 29 semantic checks
Findings
6 issues->5 auto fixable
Run npx hackmyagent secure --fix to auto remediate.
Detail
CRITICAL CRED-001 Hardcoded API key in config.json
CRITICAL MCP-003 MCP server with root filesystem access
HIGH NET-001 Server bound to 0.0.0.0
HIGH PERM-001 World readable secret files
MEDIUM GIT-002 Incomplete .gitignore patterns
LOW LOG-001 Missing audit trail configuration

Three modes of operation.

Scan for vulnerabilities. Attack with adversarial payloads. Fix with safe rollback.

Scan

209 static checks plus 29 semantic checks. Detect misconfigurations, hardcoded credentials, exposed endpoints, and supply chain risks.

  • Credential detection and scope drift
  • MCP server auditing
  • AI agent CVE detection
  • Governance file analysis (SOUL.md, ABGS)

Attack

164 adversarial payloads across 16 categories. Red team your agent with prompt injection, jailbreak, data exfiltration, capability abuse, context manipulation, and more.

  • Prompt injection (12 payloads)
  • Jailbreak (12 payloads)
  • Data exfiltration (11 payloads)
  • Custom payload support

Fix

Auto remediation with rollback. Dry run preview before applying changes. Automatic backups so any fix can be undone instantly.

  • Dry run preview mode
  • Automatic backup creation
  • One command rollback
  • Plugin based fix system

Coverage breakdown.

Static, semantic, and adversarial. Three layers across every domain HackMyAgent inspects.

209
Static checks
29
Semantic checks
164
Adversarial payloads

Credentials and secrets

  • Hardcoded API keys
  • Scope drift across providers
  • Plaintext .env files
  • Exposed tokens in configs

MCP and tool permissions

  • MCP server config audit
  • Root filesystem access
  • Tool permission boundaries
  • Skill manifest review

Network and runtime

  • Bind address exposure
  • Rate limiting
  • Process isolation
  • Sandboxing

Code and supply chain

  • Dependency vulnerabilities
  • AI agent CVE detection
  • Build artifact integrity
  • Signing and provenance

Governance and policy

  • SOUL.md scoring (ABGS)
  • OASB benchmark
  • Audit trail completeness
  • Policy enforcement

Identity and session

  • Authentication checks
  • Session lifetime
  • Encryption at rest
  • Heartbeat and liveness

Quick start.

No config files. Works out of the box. Six commands cover scan, fix, attack, benchmark, and rollback.

Terminal
# Scan everything
$ npx hackmyagent secure

# Scan and auto remediate
$ npx hackmyagent secure --fix

# Preview fixes before applying
$ npx hackmyagent secure --fix --dry-run

# Red team with adversarial payloads
$ npx hackmyagent attack --local

# Run OASB benchmark
$ npx hackmyagent secure -b oasb-1

# Rollback any changes
$ npx hackmyagent rollback

Attack categories.

164 adversarial payloads across 16 categories. Five most common patterns shown below.

Prompt Injection

12 payloads

Manipulate agent behavior via injected instructions in user input, retrieved documents, or tool output.

Jailbreak

12 payloads

Bypass safety guardrails and system constraints with persona shifts, hypothetical framing, and role escalation.

Data Exfiltration

11 payloads

Extract sensitive data, system prompts, credentials, or memory contents through indirect channels.

Capability Abuse

10 payloads

Misuse agent tools and capabilities for unintended actions outside the authorized scope.

Context Manipulation

10 payloads

Poison agent context, memory, or retrieval results to alter downstream decisions.

Supported targets.

Auto detects the host, then scans the right surfaces. No flags required.

Claude Code

CLAUDE.md, skills, MCP server configs

Cursor

.cursor/ rules, MCP configurations

VS Code

.vscode/mcp.json configurations

Generic MCP

Any MCP server setup

OASB benchmark compliance.

Run the Open Agent Security Benchmark (OASB-1) directly. 46 controls across 10 categories with three maturity levels.

10 assessment categories

  • Identity and Provenance
  • Capability and Authorization
  • Input Security
  • Output Security
  • Credential Protection
  • Supply Chain Integrity
  • Agent to Agent Security
  • Memory and Context Integrity
  • Operational Security
  • Monitoring and Response

Three maturity levels

L1 Essential
Foundational hardening
26 controls
L2 Standard
Production readiness
44 controls
L3 Hardened
High assurance
46 controls
Learn more about OASB
Terminal
$ npx hackmyagent secure -b oasb-1

  OASB-1 Benchmark Assessment
  Level: L1 Essential (26 controls)

  PASS  Identity and Provenance       4/4
  PASS  Capability and Authorization  5/5
  PASS  Input Security                5/5
  WARN  Output Security               3/4
  PASS  Credential Protection         5/5
  FAIL  Supply Chain Integrity        2/5

  Score: 84/100
  Rating: Passing

Secure your agents.

One command. No signup. Runs locally.

$npx hackmyagent secure