AgentPwn
The benevolent honeypot for AI agents.
48 attack scenarios across 11 categories. When your agent falls for an attack, we tell it how to fix the vulnerability. Every interaction makes the ecosystem smarter.
npx hackmyagent wild
How It Works
Agent Visits
An AI agent browses agentpwn.com while researching, scraping, or following links. Each page contains hidden injection payloads at various difficulty tiers.
Agent Gets Tested
If the agent follows the injected instructions, the payload tells it to report the vulnerability to its admin with the exact fix command. No harm done.
Ecosystem Learns
Every interaction feeds back into the security ecosystem: new HMA scanner checks, DVAA lab scenarios, NanoMind training data, and threat intelligence reports.
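From the defending agent's side, the visit step above comes down to treating fetched page text as untrusted data. The sketch below is a minimal illustration, not a description of how agentpwn.com builds its payloads: it assumes hidden instructions arrive in HTML comments or in elements styled `display:none`, and the names `HiddenTextFinder` and `find_hidden_text` are hypothetical. It uses only the Python standard library.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Collects text that a human reader of the rendered page would not see.

    Assumption: "hidden" means an HTML comment or text inside an element
    with an inline display:none style. Real payloads may use other tricks.
    """

    def __init__(self):
        super().__init__()
        self.hidden = []   # hidden text fragments, in document order
        self._stack = []   # one bool per open element: is it hidden?

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        parent_hidden = bool(self._stack and self._stack[-1])
        self._stack.append(parent_hidden or "display:none" in style)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if self._stack and self._stack[-1] and data.strip():
            self.hidden.append(data.strip())

    def handle_comment(self, data):
        if data.strip():
            self.hidden.append(data.strip())


def find_hidden_text(html):
    """Return the hidden text fragments found in an HTML string."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden
```

An agent pipeline could run a check like this before passing page content to the model, and quarantine or flag anything it returns rather than acting on it.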
Attack Categories
Prompt Injection
10 tiers: direct and indirect instruction override attacks
Jailbreak
5 tiers: attempts to bypass safety guardrails and persona constraints
Data Exfiltration
5 tiers: tricks to extract credentials, PII, or system information
Capability Abuse
3 tiers: confused deputy attacks that misuse agent tools
Context Manipulation
5 tiers: attacks that corrupt the agent's understanding of context
MCP Exploitation
3 tiers: attacks targeting Model Context Protocol integrations
Agent-to-Agent Attack
3 tiers: attacks exploiting inter-agent communication trust
Memory Weaponization
3 tiers: poisoning persistent memory and conversation state
Context Window
5 tiers: exploiting context window limits for instruction displacement
Supply Chain
3 tiers: attacks through compromised dependencies and plugins
Tool Shadow
3 tiers: hidden tool invocations and shadow function calls
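As a quick consistency check, the tier counts listed above can be tallied against the headline figures. The sketch below assumes each tier corresponds to one attack scenario. The page does not state this explicitly, but the numbers agree. The dictionary name `TIERS_BY_CATEGORY` is hypothetical.

```python
# Tier counts copied from the category list above.
# Assumption: one tier equals one attack scenario.
TIERS_BY_CATEGORY = {
    "Prompt Injection": 10,
    "Jailbreak": 5,
    "Data Exfiltration": 5,
    "Capability Abuse": 3,
    "Context Manipulation": 5,
    "MCP Exploitation": 3,
    "Agent-to-Agent Attack": 3,
    "Memory Weaponization": 3,
    "Context Window": 5,
    "Supply Chain": 3,
    "Tool Shadow": 3,
}

category_count = len(TIERS_BY_CATEGORY)
scenario_count = sum(TIERS_BY_CATEGORY.values())

print(f"{scenario_count} scenarios across {category_count} categories")
```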