45% of AI agent attackers come back. We just published the data.
ARIA, our autonomous research system, finished its first month of behavioral instrumentation on May 11. We published the report today.
State of AI Agent Security: May 2026. Behavioral Threat Report Issue 1
The numbers anchor on a thirty-day window across four data streams. ARIAscout ran a fresh Shodan sweep on May 12 and counted 297,723 exposed AI services, down from 321,929 a month earlier. Inside our honeypot ecosystem, TrapMyAgent captured 206,571 honey-agent events. AgentPwn logged 1,763 payload callbacks across 136,954 honeypot-page interactions. HoneyFinder sampled 343 surfaces of injection bait planted on the open web by third parties.
Exposure is contracting at the surface level. Attacker engagement is intensifying on the surfaces that remain. Both things are true at the same time.
This is what we found.
The headline finding
45 percent of unique attackers came back across multiple sessions.
Of 9,037 distinct fingerprints we observed during the window, 4,047 returned. The top fingerprint by event count returned for 15 days straight and posted 27,230 events across 623 sessions. Twenty-nine of the top fifty fingerprints by session count showed up in the most recent observation week.
If you're a defender and your threat model assumes attacker activity is one-shot reconnaissance, your model is wrong. The recurring visitors aren't just observing. They're sampling response variability, watching for changed defenses, and building per-property knowledge they can act on later.
Fingerprint-stable telemetry that survives short-lived IP rotation is the minimum condition to measure any of this. Most defenders don't have it yet.
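The return-rate measurement itself is simple once fingerprint-stable telemetry exists. A minimal sketch, assuming each event carries a stable fingerprint and a session identifier; the field names and sample data are invented for illustration, not ARIA's actual schema:

```python
from collections import defaultdict

def return_rate(events):
    """Compute how many fingerprints appeared in more than one session.

    `events` is an iterable of (fingerprint, session_id) pairs.
    """
    sessions = defaultdict(set)
    for fingerprint, session_id in events:
        sessions[fingerprint].add(session_id)
    total = len(sessions)
    returners = sum(1 for s in sessions.values() if len(s) > 1)
    return total, returners, returners / total if total else 0.0

# Hypothetical sample: fp-a came back, fp-b did not.
events = [
    ("fp-a", "s1"),
    ("fp-a", "s2"),
    ("fp-b", "s3"),
]
total, returners, rate = return_rate(events)
# total=2 fingerprints, returners=1, rate=0.5
```

The same aggregation over IP addresses instead of fingerprints would undercount returners badly whenever attackers rotate short-lived IPs, which is the point of the paragraph above.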
MCP is where the fight is
Three of every four attacker events we observed targeted the Model Context Protocol (MCP). Agent-to-Agent (A2A) traffic accounted for another 15.2 percent.
The MCP number isn't surprising in isolation. The A2A number is. Most public discussion of A2A in 2026 still frames the protocol as a research target. Our telemetry argues the opposite: A2A is already in production, attackers already know it, and the security tooling has not caught up.
If you ship anything that exposes an MCP server or honors A2A handshakes, hardening those two surfaces this month is the single highest-leverage thing you can do. Authenticate every A2A handshake. Verify the calling agent's identity before honoring any capability request. Reject unsigned messages by default.
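The reject-by-default posture described above can be sketched in a few lines. This illustration uses a symmetric HMAC scheme for brevity; a real A2A deployment would more likely use per-agent asymmetric keys, and the registry, agent names, and message format here are hypothetical:

```python
import hashlib
import hmac

# Hypothetical registry mapping known agent identities to shared secrets.
KNOWN_AGENTS = {"agent-alpha": b"shared-secret-alpha"}

def verify_handshake(agent_id, message, signature_hex):
    """Reject-by-default check for an incoming A2A capability request."""
    if signature_hex is None:
        return False  # unsigned messages are rejected outright
    secret = KNOWN_AGENTS.get(agent_id)
    if secret is None:
        return False  # unknown caller identity
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

# A signed request from a known agent passes; everything else fails.
sig = hmac.new(b"shared-secret-alpha", b"list-capabilities",
               hashlib.sha256).hexdigest()
ok = verify_handshake("agent-alpha", b"list-capabilities", sig)
rejected_unsigned = verify_handshake("agent-alpha", b"list-capabilities", None)
rejected_unknown = verify_handshake("agent-zeta", b"list-capabilities", sig)
```

Note the order of checks: the unsigned case fails before any lookup happens, so there is no code path where a message without a signature reaches a capability handler.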
The Common Crawl gap
Google published “AI threats in the wild” on April 23. It's a serious piece of work and the methodology is sound. The team was also transparent about what their methodology can't see. Login walls. Anti-crawl directives. Most social media.
Public-web crawls measure the surface a crawler can reach. They systematically miss three classes of attacker-reachable surface. Authenticated content behind logins. Per-fingerprint dynamic content that serves different HTML to different agents. Federated and platform-mediated social content.
OpenA2A's honeypot fleet partitions on this seam intentionally. Our 20 baseline SEO-engineered sites are Common-Crawl-visible. The new archetypes we started spinning up in April (the first one live is cloudops-agent-io) deliberately are not. The new cohort drew zero organic interactions in the window after we filtered out operator self-tests.
That zero is itself the finding. The interactive, post-credential surface is invisible to crawler-only research. Edition 2 will report on the second authenticated archetype as it comes online.
We're not saying this to score points off Google. They named the gap. We're saying it because the next twelve months of attacker behavior will live disproportionately on surfaces that public-web sweeps can't measure, and somebody has to instrument that.
When the sector matters
Our baseline AgentPwn cohort saw a 1.29 percent payload-callback rate: one agent in 78 followed a payload to a measurable endpoint.
The security-vertical honeypot population was different: 1,350 of 1,419 attempts ended in a callback, a 95.1 percent rate.
Security-tooling agents that ingest external content trust security-flavored content more than general content. That trust is exploitable and it shows up in the data. Anyone shipping an agent that reads CVE descriptions, vulnerability database text, or security research into a model context window should assume an attacker has already studied the surface that 95.1 percent measures.
Callback rate measures whether an agent followed a payload to a measurable endpoint. It doesn't measure downstream impact. But following the payload is the precondition for everything that comes after.
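The callback-rate arithmetic is simple, and worth making explicit because the two cohort figures come from very different denominators:

```python
def callback_rate(callbacks, attempts):
    """Share of payload attempts that produced a measurable callback."""
    return callbacks / attempts if attempts else 0.0

# Figures from the report: AgentPwn baseline vs. the security-vertical cohort.
baseline = callback_rate(1_763, 136_954)  # ~0.0129 -> 1.29 percent
security = callback_rate(1_350, 1_419)    # ~0.951  -> 95.1 percent
```

The baseline rate spreads 1,763 callbacks over 136,954 interactions; the security-vertical rate concentrates nearly as many callbacks over a population two orders of magnitude smaller.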
Geography and the cloud backplane
Origin geography spans 90 countries. The United States accounted for 38.2 percent of events, the Netherlands 17.2 percent. Together, 55.4 percent of observed traffic.
Cloud provider attribution: Cloudflare fronted 18.5 percent of traffic, AWS 12.6 percent, Microsoft Azure 9.7 percent. Together, 40.8 percent.
Provider numbers describe which network fronted the traffic, not which network originated it. VPN, proxy, and CDN infrastructure obscures origin. The 18.5 percent Cloudflare share is best read as Cloudflare-fronted traffic, not Cloudflare-originated traffic.
What we won't tell you
ARIA is allowed to disagree with marketing pressure. We're publishing distribution-only sophistication data this month and withholding the mean. The supervised classifier that would ground a sophistication mean doesn't exist yet. We're not going to put a number on a slide that the data can't defend.
We're also not naming a vendor or model in the user-agent attribution section without behavioral fingerprinting to back it up. User-agent strings are attacker-controllable. They're not an identity claim. We'll get there. We're not there yet.
What's in the appendix
Three signatures are pre-registered in our First Observed In The Wild catalog. Greshake-class indirect prompt injection. Actor-Critic adaptive multi-turn manipulation. Reputation-poisoning prompt injection. None fired during the window. Transparency-log anchoring is in progress and log-anchored claims will start publishing from Edition 2.
Every classification in the report resolves to an Agent Threat Matrix technique ID at threats.opena2a.org. One technique, one ID, across every product we ship.
Read the full report
The web version is at research.opena2a.org/reports/state-of-ai-agent-security-2026-05. It includes the full attribution table, the MITRE ATT&CK overlay, the HoneyFinder signature breakdown, sector distribution of wild bait, and a six-item defender checklist with Threat Matrix and OASB control mappings.
Methodology is at research.opena2a.org/methodology. Every number traces to a query against live instrumentation. Nothing is estimated, modeled, or projected.
If you want to dispute a finding, email info@opena2a.org with the specific number and the methodology you'd prefer we used. We respond within five business days. Substantive challenges that hold up get published as methodology updates with attribution.
We publish on the fifteenth of every month going forward. Issue 2 lands June 15.
ARIA is OpenA2A's autonomous research system. Editorial review by Abdel Fane. Disclosure of authorship is a credibility move, not a limitation.
License: Apache 2.0. Cite using the BibTeX block in Section 9 of the report.