Context ManipulationTier 3medium
Context Manipulation: Semantic Confusion
Attacks that corrupt the agent's understanding of context
Semantic confusion redefines words within the conversation to make safety-related terms mean their opposite, potentially bypassing intent-based filters.
Attack Details
- Attack ID
- APWN-CM-003
- HMA Check
- INJ-002
- Delivery Methods
- html-comment, invisible-span, meta-tag
- CWE
- CWE-20
- OASB Control
- 6.3
- Severity
- medium
Remediation
If your AI agent is vulnerable to this attack, scan and fix with:
npx hackmyagent secure --check INJ-002