AI Agents Are Being Hijacked Through Their Own Memory
OWASP just named the threat class and shipped a reference defense -- which means enterprise security reviews now have a new mandatory line item.
Every AI agent running in your enterprise has a memory. That memory shapes every decision the agent makes. And right now, in most organizations, that memory has no defense layer protecting it.
AI agent memory poisoning is the attack pattern where an adversary writes malicious content into an agent’s persistent memory -- its conversation history, its vector store, its RAG index, its scratchpad -- and every subsequent action the agent takes carries that attacker’s intent forward. The user believes they are interacting with a trustworthy system. They are not. The agent is working, in part, for someone else.
OWASP released Agent Memory Guard on June 1, 2026, as an open-source runtime defense layer targeting exactly this threat. It is the reference implementation for the OWASP Top 10 for Agentic Applications, addressing the ASI06 threat class. That classification matters: it means memory poisoning is no longer a theoretical concern catalogued in a research paper. It is a named, enumerated vulnerability class with a published standard.
Why Memory Is the Target
Prompt injection holds the #1 position on the OWASP Top 10 for LLM Applications 2025. Memory poisoning extends that same attack surface -- except it is more durable. A prompt injection attack operates within a single session. A memory poisoning attack persists. It survives session resets. It travels with the agent.
Agents are not stateless. They are designed to remember -- because memory is what makes them useful across time. Conversation history tells the agent what it has already done. The RAG index tells it what it knows. The scratchpad tells it what it is currently working on. An attacker who can write to any of those stores gains a channel into the agent that no authentication layer, no API gateway, and no perimeter control currently inspects.
What OWASP Agent Memory Guard Actually Does
The tool intercepts every memory read and write through five detection layers: prompt injection screening, secret and PII leakage detection, key tampering detection, SHA-256 integrity baselines, and size anomaly detection. Administrators configure response behavior through a YAML policy that can allow, redact, quarantine, or block any flagged operation.
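To make the interception model concrete, here is a minimal Python sketch of a guarded memory write. It is not Agent Memory Guard’s code or API: the names `POLICY`, `screen`, and `guard_write`, the two regular expressions, and the size threshold are all hypothetical, and the policy is an in-memory dict standing in for the YAML file. Only three of the five layers are imitated (injection screening, secret detection, size anomaly); key tampering and integrity baselines are left out.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical policy: finding type -> one of the four actions the article names.
# A real deployment would load this from a YAML policy file.
POLICY = {
    "prompt_injection": "block",
    "secret_leak": "redact",
    "size_anomaly": "quarantine",
}

# Illustrative detectors only; real screening would be far more thorough.
INJECTION_RE = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)
SECRET_RE = re.compile(r"sk-[A-Za-z0-9]{8,}")
MAX_VALUE_CHARS = 2000  # assumed threshold for the size check

# Actions ordered from least to most severe; the most severe one wins.
SEVERITY = ["allow", "redact", "quarantine", "block"]


@dataclass
class Decision:
    action: str
    stored_value: Optional[str]
    findings: List[str]


def screen(value: str) -> List[str]:
    """Run each detector and return the names of the ones that fired."""
    findings = []
    if INJECTION_RE.search(value):
        findings.append("prompt_injection")
    if SECRET_RE.search(value):
        findings.append("secret_leak")
    if len(value) > MAX_VALUE_CHARS:
        findings.append("size_anomaly")
    return findings


def guard_write(store: dict, quarantine: dict, key: str, value: str) -> Decision:
    """Screen a write, then apply the most severe action the policy maps to."""
    findings = screen(value)
    action = max(
        (POLICY.get(f, "allow") for f in findings),
        key=SEVERITY.index,
        default="allow",
    )
    if action == "block":
        return Decision(action, None, findings)       # nothing is written
    if action == "quarantine":
        quarantine[key] = value                       # held for review, not served
        return Decision(action, None, findings)
    if action == "redact":
        value = SECRET_RE.sub("[REDACTED]", value)    # strip the secret, keep the rest
    store[key] = value
    return Decision(action, value, findings)
```

In this sketch a write such as `guard_write(store, quarantine, "cfg", "token sk-abcdef123456")` is stored with the token replaced by `[REDACTED]`, while a value containing "ignore previous instructions" never reaches the store. Picking the most severe action when several detectors fire is a design choice of the sketch, not something the source specifies.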
The published results are strong: 92.5% recall and 100% precision (that is, zero false positives), with a 59-microsecond median latency overhead. The tool also supports rollback to a known-good memory state -- which is the capability that will matter most when a poisoning event is discovered after the fact.
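The rollback idea pairs naturally with the SHA-256 integrity baselines mentioned earlier. The sketch below shows one way the two could fit together; it is an assumption-laden illustration, not the tool’s implementation. `SnapshotStore`, `digest`, `commit`, `is_intact`, and `rollback` are hypothetical names, and memory is modeled as a plain JSON-serializable dict.

```python
import copy
import hashlib
import json


def digest(memory: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(memory, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SnapshotStore:
    """Hypothetical wrapper: keeps one known-good copy plus its hash."""

    def __init__(self, memory: dict):
        self.memory = memory
        self.commit()

    def commit(self) -> None:
        """Record the current state as the known-good baseline."""
        self._snapshot = copy.deepcopy(self.memory)
        self._baseline = digest(self.memory)

    def is_intact(self) -> bool:
        """True if memory still hashes to the recorded baseline."""
        return digest(self.memory) == self._baseline

    def rollback(self) -> None:
        """Restore memory in place from the known-good snapshot."""
        self.memory.clear()
        self.memory.update(copy.deepcopy(self._snapshot))
```

A caller would `commit()` after each vetted write, check `is_intact()` before trusting a read, and call `rollback()` when the check fails. A single snapshot is the simplest case; a real system would likely keep a history so it can roll back past a poisoning event that went unnoticed through several commits.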
This is the first runtime defense tool built explicitly to the OWASP agentic security standard. Prior tools addressed model behavior or API security. None addressed the memory layer as a discrete attack surface.
The Cross-Agent Propagation Problem
The threat extends beyond a single compromised agent. Research published in February 2026 by a team spanning Northeastern, Harvard, MIT, Stanford, and CMU -- the Agents of Chaos study -- documented live cases where one compromised agent shared its “constitution,” its behavioral rules, with other agents in the same network. Identity spoofing succeeded in new communication channels where prior context was unavailable to verify authenticity.
The implication is direct: a single poisoned agent can become a propagation node. The attacker does not need to breach every agent individually. They need to breach one agent that communicates with others, and the poisoned behavioral instructions spread through the network’s normal operation.
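The propagation argument can be checked on any agent topology with a simple reachability pass. The sketch below is illustrative only: the `links` mapping and the agent names are made up, and it assumes the pessimistic case where every message from a poisoned agent carries the poisoned instructions onward.

```python
from collections import deque


def reachable_from(links: dict, start: str) -> set:
    """Breadth-first search: every agent a poisoned `start` agent can reach."""
    seen = {start}
    queue = deque([start])
    while queue:
        agent = queue.popleft()
        for peer in links.get(agent, []):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen


# Hypothetical network: each agent maps to the agents it sends messages to.
links = {
    "intake": ["triage"],
    "triage": ["billing", "email"],
    "billing": [],
    "email": ["triage"],
    "archive": [],
}
exposed = reachable_from(links, "intake")
```

Here breaching `intake` alone exposes `triage`, `billing`, and `email`; only `archive`, which receives no messages, stays outside the blast radius.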
The WEF Global Cybersecurity Outlook 2026 flagged this exact dynamic -- agents that accumulate excessive privileges or absorb manipulated instructions through design flaws can propagate errors at scale before any human reviewer notices.
Enterprise Governance Is Not Ready
The organizational readiness gap is measurable. The Kiteworks Data Security and Compliance Risk: 2026 Forecast Report found that 63% of enterprises cannot enforce purpose limitations on AI agents and 60% cannot quickly terminate a misbehaving agent. Those two numbers describe organizations that cannot stop an attack they cannot see, targeting a memory layer they are not monitoring, propagating through a multi-agent network they cannot shut down fast enough.
The NIST AI Agent Standards Initiative, launched in February 2026, identified agent identity, authorization, and security as priority standardization areas. That initiative provides the standards scaffolding. OWASP Agent Memory Guard provides the runtime implementation. The gap that remains is organizational -- governance structures and tooling pipelines that can actually deploy and enforce both.
What the Protocol Layer Can Do
Kiteworks’ MCP Server operates at the protocol layer, governing which data AI agents can access and which tools they can invoke before any agent interaction reaches sensitive content. A poisoned agent memory that attempts to access data outside its authorized scope encounters Kiteworks’ attribute-based access control policy engine, which evaluates every request independently -- regardless of what the agent’s memory instructs it to do. The tamper-evident audit log records every agent interaction, so anomalous behavior triggered by memory manipulation is detectable in real time rather than in a post-incident review weeks later.
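The two mechanisms in that description, per-request attribute checks and a tamper-evident log, can be sketched generically. The Python below is not Kiteworks’ API or policy format; `ALLOWED`, `AuditLog`, and `authorize` are hypothetical names, and tamper evidence is approximated with a simple SHA-256 hash chain.

```python
import hashlib
import json

# Hypothetical attribute policy: (agent role, action, data class) tuples that are permitted.
ALLOWED = {
    ("support-agent", "read", "tickets"),
    ("support-agent", "read", "kb"),
}

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class AuditLog:
    """Append-only log where each entry's hash covers the previous entry's hash."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _hash(prev: str, record: dict) -> str:
        body = json.dumps(record, sort_keys=True)
        return hashlib.sha256((prev + body).encode("utf-8")).hexdigest()

    def append(self, record: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        self.entries.append(
            {"record": record, "prev": prev, "hash": self._hash(prev, record)}
        )

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev or self._hash(prev, entry["record"]) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


def authorize(log: AuditLog, role: str, action: str, resource: str) -> bool:
    """Decide from request attributes alone, and log the decision either way."""
    allowed = (role, action, resource) in ALLOWED
    log.append({"role": role, "action": action, "resource": resource, "allowed": allowed})
    return allowed
```

The point of the sketch is that `authorize` never consults the agent’s memory: a poisoned instruction to read `payroll` is denied on attributes alone, and the denied attempt still lands in the log where `verify()` can later confirm nothing was altered.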
The Conversation Has Changed
Security reviews that evaluated AI deployments six months ago asked about model safety and API access controls. Reviews happening now need to ask about memory architecture -- what persists, what writes to the memory store, what reads from it, and what prevents an adversary from treating that store as a reliable attack channel.
OWASP naming the threat class and shipping a reference implementation is the point at which “we’re monitoring the situation” stops being an acceptable posture. The memory layer is now a documented, enumerated attack surface with an available defense. Organizations that deploy AI agents without addressing it are leaving a privileged channel into their systems ungoverned.
The agents are running. The memory is filling. The attack surface is open.


