Security toolkit for AI agents.
209 static checks, 29 semantic checks, 164 adversarial payloads, auto remediation with rollback.
npx hackmyagent secureScans in seconds. Auto remediation with rollback. No signup.
What you get back.
One scan. Every finding tied to a check ID. Run with --fix to auto remediate.
Three modes of operation.
Scan for vulnerabilities. Attack with adversarial payloads. Fix with safe rollback.
Scan
209 static checks plus 29 semantic checks. Detect misconfigurations, hardcoded credentials, exposed endpoints, and supply chain risks.
- Credential detection and scope drift
- MCP server auditing
- AI agent CVE detection
- Governance file analysis (SOUL.md, ABGS)
Attack
164 adversarial payloads across 16 categories. Red team your agent with prompt injection, jailbreak, data exfiltration, capability abuse, context manipulation, and more.
- Prompt injection (12 payloads)
- Jailbreak (12 payloads)
- Data exfiltration (11 payloads)
- Custom payload support
Fix
Auto remediation with rollback. Dry run preview before applying changes. Automatic backups so any fix can be undone instantly.
- Dry run preview mode
- Automatic backup creation
- One command rollback
- Plugin based fix system
Coverage breakdown.
Static, semantic, and adversarial. Three layers across every domain HackMyAgent inspects.
Credentials and secrets
- Hardcoded API keys
- Scope drift across providers
- Plaintext .env files
- Exposed tokens in configs
MCP and tool permissions
- MCP server config audit
- Root filesystem access
- Tool permission boundaries
- Skill manifest review
Network and runtime
- Bind address exposure
- Rate limiting
- Process isolation
- Sandboxing
Code and supply chain
- Dependency vulnerabilities
- AI agent CVE detection
- Build artifact integrity
- Signing and provenance
Governance and policy
- SOUL.md scoring (ABGS)
- OASB benchmark
- Audit trail completeness
- Policy enforcement
Identity and session
- Authentication checks
- Session lifetime
- Encryption at rest
- Heartbeat and liveness
Quick start.
No config files. Works out of the box. Six commands cover scan, fix, attack, benchmark, and rollback.
# Scan everything $ npx hackmyagent secure # Scan and auto remediate $ npx hackmyagent secure --fix # Preview fixes before applying $ npx hackmyagent secure --fix --dry-run # Red team with adversarial payloads $ npx hackmyagent attack --local # Run OASB benchmark $ npx hackmyagent secure -b oasb-1 # Rollback any changes $ npx hackmyagent rollback
Attack categories.
164 adversarial payloads across 16 categories. Five most common patterns shown below.
Prompt Injection
Manipulate agent behavior via injected instructions in user input, retrieved documents, or tool output.
Jailbreak
Bypass safety guardrails and system constraints with persona shifts, hypothetical framing, and role escalation.
Data Exfiltration
Extract sensitive data, system prompts, credentials, or memory contents through indirect channels.
Capability Abuse
Misuse agent tools and capabilities for unintended actions outside the authorized scope.
Context Manipulation
Poison agent context, memory, or retrieval results to alter downstream decisions.
Supported targets.
Auto detects the host, then scans the right surfaces. No flags required.
Claude Code
CLAUDE.md, skills, MCP server configsCursor
.cursor/ rules, MCP configurationsVS Code
.vscode/mcp.json configurationsGeneric MCP
Any MCP server setupOASB benchmark compliance.
Run the Open Agent Security Benchmark (OASB-1) directly. 46 controls across 10 categories with three maturity levels.
10 assessment categories
- Identity and Provenance
- Capability and Authorization
- Input Security
- Output Security
- Credential Protection
- Supply Chain Integrity
- Agent to Agent Security
- Memory and Context Integrity
- Operational Security
- Monitoring and Response
Three maturity levels
$ npx hackmyagent secure -b oasb-1 OASB-1 Benchmark Assessment Level: L1 Essential (26 controls) PASS Identity and Provenance 4/4 PASS Capability and Authorization 5/5 PASS Input Security 5/5 WARN Output Security 3/4 PASS Credential Protection 5/5 FAIL Supply Chain Integrity 2/5 Score: 84/100 Rating: Passing
Secure your agents.
One command. No signup. Runs locally.
npx hackmyagent secure