Real-World Attacks Behind the OWASP Agentic AI Top 10: Lessons from 2025
Agentic AI has leapt from the lab into the heart of modern workflows, powering tools like Claude Desktop, Amazon Q, and GitHub Copilot. This shift has not only transformed productivity but also opened the floodgates to a new breed of cyber threats. Attackers are no longer just probing static code—they’re hijacking autonomous agents, manipulating objectives, and exploiting the very trust that makes these systems so powerful. In 2025, real-world incidents—from malicious npm packages with thousands of downloads to high-severity vulnerabilities in widely used AI extensions—have exposed the unique risks of agentic AI (BleepingComputer).
This analysis dives into the actual attacks that shaped the OWASP Agentic AI Top 10, revealing how instruction injection, toolchain poisoning, and dynamic supply chain threats are redefining what it means to secure AI. With agents autonomously fetching content, executing code, and making decisions—often with minimal human oversight—the stakes have never been higher. Let’s unpack the real-world scenarios, industry responses, and what’s next for agentic AI security (BleepingComputer).
Real-World Attack Scenarios Shaping Agentic AI Security
Escalation of Threats: From Research to Production Environments
The rapid transition of agentic AI from controlled research environments to widespread production deployment has dramatically shifted the threat landscape. In 2025, agentic AI systems such as Claude Desktop, Amazon Q, and GitHub Copilot became integral to developer workflows, managing sensitive tasks like email handling, workflow automation, and direct code execution (BleepingComputer). This mainstream adoption has made agentic AI a lucrative target for attackers, who have exploited the implicit trust and broad system access these agents possess. Unlike traditional applications, agentic AI operates autonomously, fetching and executing content, making decisions, and interacting with critical infrastructure—often with limited human oversight.
Attackers have quickly adapted, leveraging the unique capabilities and vulnerabilities of agentic AI. Traditional security measures such as static code analysis, signature-based detection, and perimeter controls have proven insufficient for these dynamic, autonomous systems. As a result, security teams have faced a surge in sophisticated attacks that exploit the very features that make agentic AI powerful.
Instruction Injection and Objective Manipulation
A defining class of attacks against agentic AI involves manipulating the agent’s objectives through instruction injection. Unlike standard prompt injection seen in language models, these attacks target the agent’s autonomous decision-making process, redirecting its goals and actions.
For example, in November 2025, researchers uncovered a malicious npm package that had been live for two years with over 17,000 downloads (BleepingComputer). This package contained hidden instructions designed to manipulate AI agents that interacted with it. When an agent processed content from the compromised package, it unknowingly executed malicious objectives embedded within, such as exfiltrating credentials or altering system configurations. The attack’s effectiveness stemmed from the agent’s inability to distinguish between legitimate commands and maliciously crafted instructions within the data it processed.
This scenario underscores the risk of “Agent Goal Hijack,” where attackers embed instructions in seemingly benign content, causing agents to act against organizational interests. The consequences range from data theft to the execution of destructive commands, all triggered by the agent’s autonomous processing of external inputs.
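One way to picture the defensive side of this problem is to label external content as data before an agent ever sees it, and to flag text that reads like an instruction. The sketch below is a minimal illustration under stated assumptions: the pattern list, the function names, and the envelope fields are all hypothetical, and pattern matching alone is not a complete defense against goal hijacking.

```python
import re

# Hypothetical, illustrative patterns. Real injected instructions vary
# widely, so this list is an assumption and not a complete detector.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"forget everything you know", re.I),
    re.compile(r"do not (tell|inform) the user", re.I),
]


def flag_instruction_like_text(content: str) -> list[str]:
    """Return the suspicious phrases found in untrusted content."""
    hits = []
    for pattern in SUSPICIOUS_PATTERNS:
        match = pattern.search(content)
        if match:
            hits.append(match.group(0))
    return hits


def wrap_untrusted(content: str, source: str) -> dict:
    """Package external content as labeled data rather than as instructions."""
    hits = flag_instruction_like_text(content)
    return {
        "role": "data",            # never merged into the agent's instructions
        "source": source,          # where the content came from
        "quarantined": bool(hits), # True if any pattern matched
        "flags": hits,
        "content": content,
    }
```

A caller would route quarantined envelopes to review instead of handing them to the agent. The more important design choice is the "data" label itself: the agent runtime, not the content, decides what counts as an instruction.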
Toolchain Poisoning and Malicious Automation
Another prominent attack vector involves the misuse of legitimate tools by agentic AI, often as a result of poisoned codebases or manipulated initialization routines. In July 2025, Amazon’s AI coding assistant, Amazon Q, was compromised through a malicious pull request that injected destructive instructions into its operational code (BleepingComputer). The injected code instructed the agent to perform actions such as cleaning systems to a near-factory state and deleting file-system and cloud resources using AWS CLI commands.
The attack was facilitated by the initialization code, which included flags like --trust-all-tools --no-interactive, effectively bypassing all confirmation prompts and allowing the agent to execute destructive commands without human intervention. During the five days the compromised extension was live, it was installed by over a million developers. Although Amazon reported that the extension was not functional during this period, the potential impact highlights the scale and speed at which agentic AI attacks can propagate.
This scenario exemplifies the risk of “Tool Misuse & Exploitation,” where agents are manipulated into using trusted tools for malicious purposes. The automation and autonomy of agentic AI amplify the impact of such attacks, as a single compromised agent can trigger widespread damage across cloud infrastructure and developer environments.
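The role that confirmation prompts play here can be sketched as a small approval gate. This is an illustrative example only, not how Amazon Q is implemented: the marker set, `is_destructive`, and `gate_command` are hypothetical names, and a real policy would need to be much richer than token matching.

```python
import shlex
from typing import Callable

# Hypothetical markers for destructive operations (assumption for this sketch).
DESTRUCTIVE_MARKERS = {"rm", "rb", "delete-bucket", "terminate-instances"}


def is_destructive(command: str) -> bool:
    """Heuristically flag commands that contain a destructive token."""
    tokens = shlex.split(command)
    return any(token in DESTRUCTIVE_MARKERS for token in tokens)


def gate_command(command: str, approve: Callable[[str], bool]) -> bool:
    """Return True only if the command may run.

    Commands flagged as destructive always go through the `approve`
    callback, which is expected to ask a human. There is deliberately no
    option that skips this step.
    """
    if not is_destructive(command):
        return True
    return approve(command)
```

In an interactive setting, `approve` could prompt the developer; in an unattended pipeline, it could simply return False. The point of the design is that an agent-supplied flag cannot turn the gate off.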
Runtime Code Execution Vulnerabilities
Agentic AI systems are inherently designed to execute code, which, while enabling powerful automation, also introduces significant security risks. In November 2025, three remote code execution (RCE) vulnerabilities were disclosed in Claude Desktop’s official extensions for Chrome, iMessage, and Apple Notes (BleepingComputer). These vulnerabilities stemmed from command injection: the extensions passed unsanitized input into AppleScript execution. All three extensions were developed and promoted by Anthropic.
The attack chain was straightforward yet devastating: a user would ask Claude a seemingly innocuous question, prompting it to search the web. If one of the search results was an attacker-controlled page containing hidden instructions, Claude would process the page, trigger the vulnerable extension, and execute the injected code with full system privileges. Sensitive assets such as SSH keys, AWS credentials, and browser passwords could be compromised simply by interacting with the AI assistant.
Anthropic confirmed all three vulnerabilities as high-severity (CVSS 8.9), and patches were rapidly deployed. However, the incident highlighted a critical reality: when agents can execute code, every input—regardless of its apparent legitimacy—becomes a potential attack vector.
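The general remedy for this class of bug is to keep untrusted values out of the script text entirely. The sketch below shows one such pattern, assuming macOS's `osascript`, which passes trailing command-line arguments to the script's `run` handler. The function name and the script body are hypothetical and are not taken from the affected extensions or from Anthropic's patches.

```python
from urllib.parse import urlparse


def build_osascript_argv(untrusted_url: str) -> list[str]:
    """Build an osascript invocation that passes the URL as an argument.

    The script text is a fixed constant. The untrusted value is never
    concatenated into it, so quotes in the value cannot change the script.
    """
    parsed = urlparse(untrusted_url)
    # Reject anything that is not a plain web URL. This also blocks values
    # starting with "-" that could be mistaken for command-line options.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("unsupported URL")

    script_lines = [
        "on run argv",
        "open location (item 1 of argv)",
        "end run",
    ]
    argv = ["osascript"]
    for line in script_lines:
        argv += ["-e", line]
    argv.append(untrusted_url)  # delivered to the run handler as data
    return argv
```

The resulting list would be executed with `subprocess.run(argv)` and without `shell=True`, so no shell parses the value either. The function only builds the argument list, which keeps it testable on systems without `osascript`.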
Dynamic Supply Chain Attacks Targeting Agentic AI
Traditional supply chain attacks focus on static dependencies, but agentic AI introduces a new dimension: dynamic, runtime supply chain threats. Attackers now target the plugins, MCP (Model Context Protocol) servers, and external tools that agents load and interact with during operation (BleepingComputer).
In September 2025, the first malicious MCP server was discovered in the wild, cited in the OWASP Agentic AI Top 10 exploit tracker. This server was designed to appear as a legitimate component, but once integrated, it delivered poisoned models and plugins to connected agents. The compromised MCP server could manipulate agent behavior, exfiltrate data, or introduce further vulnerabilities into the operational environment.
This form of attack is particularly insidious because it leverages the dynamic nature of agentic AI ecosystems. Agents routinely fetch and execute new plugins and models at runtime, often with minimal provenance checks. Attackers exploit this trust, introducing malicious components that are difficult to detect through traditional static analysis or signature-based methods.
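A basic provenance check for runtime-loaded components is digest pinning: a component loads only if its name is on an allowlist and its bytes hash to the recorded value. The sketch below is a minimal illustration under stated assumptions. The component names and the `pinned` mapping are hypothetical, and pinning is a simpler stand-in for full publisher signature verification.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the artifact as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_component(name: str, artifact: bytes, pinned: dict[str, str]) -> bool:
    """Allow a component only if it is known and matches its pinned digest.

    `pinned` maps a component name to the hex digest recorded when the
    component was reviewed (a hypothetical allowlist for this sketch).
    """
    expected = pinned.get(name)
    if expected is None:
        return False  # unknown components are rejected by default
    # compare_digest avoids leaking how many characters matched
    return hmac.compare_digest(sha256_hex(artifact), expected)
```

Because the check runs at load time rather than at install time, a component that is silently replaced after review fails verification the next time the agent fetches it.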
Human Trust Exploitation and Over-Reliance on Agentic AI
Agentic AI systems are increasingly positioned as trusted advisors and autonomous decision-makers, leading to a growing risk of human-agent trust exploitation. Attackers capitalize on user over-reliance, crafting scenarios where agents provide recommendations or take actions that users accept without sufficient scrutiny.
For instance, prompt-based attacks can subtly influence agent outputs, causing users to follow malicious advice or approve harmful actions. In environments where agents are granted broad privileges—such as access to cloud resources, sensitive files, or critical infrastructure—the consequences of misplaced trust can be severe (BleepingComputer).
The risk is compounded by the opaque nature of many agentic AI decisions. Users may not fully understand the reasoning behind an agent’s recommendation, making it difficult to distinguish between legitimate and manipulated outputs. This dynamic creates fertile ground for attackers to exploit both technical and psychological vulnerabilities, leading to cascading failures across systems and organizations.
Adaptive Defense Strategies in Response to Evolving Threats
The escalation of real-world attacks has driven the development of adaptive defense strategies tailored to agentic AI. Security teams are increasingly focused on runtime monitoring, behavioral analysis, and rapid response mechanisms to mitigate the unique risks posed by autonomous agents.
Key defensive measures include:
- Comprehensive Inventory and Provenance Checks: Organizations are urged to maintain up-to-date inventories of all MCP servers, plugins, and tools used by their agents. Verifying the provenance of each component—preferably through signed packages from trusted publishers—reduces the risk of supply chain compromise (BleepingComputer).
- Principle of Least Privilege: Agents should operate with the minimum necessary privileges, avoiding broad credentials that could enable lateral movement or privilege escalation in the event of compromise.
- Behavioral Monitoring: Static analysis is insufficient for detecting runtime attacks. Continuous monitoring of agent behavior—tracking actions, decisions, and interactions—enables the early detection of anomalies indicative of compromise.
- Rapid Containment Mechanisms: The ability to quickly shut down or isolate compromised agents is critical. Organizations are advised to implement “kill switches” and automated containment protocols to limit the blast radius of successful attacks.
These adaptive strategies reflect a shift from traditional, perimeter-focused security to a more dynamic, agent-centric approach, recognizing the unique operational realities of autonomous AI systems.
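Behavioral monitoring and rapid containment can be combined in a very small runtime guard. The sketch below is an assumption-laden illustration: the `AgentMonitor` class, its thresholds, and the tool names are hypothetical, and a production system would track far more signals than tool names and action rates.

```python
from collections import deque


class AgentMonitor:
    """Track recent agent actions and trip a kill switch on anomalies."""

    def __init__(self, allowed_tools: set[str], max_actions_per_window: int,
                 window_seconds: float) -> None:
        self.allowed_tools = allowed_tools
        self.max_actions_per_window = max_actions_per_window
        self.window_seconds = window_seconds
        self.killed = False
        self._timestamps: deque = deque()

    def record(self, tool: str, now: float) -> bool:
        """Return True if the action may proceed, False once the agent is halted."""
        if self.killed:
            return False
        if tool not in self.allowed_tools:
            self.killed = True  # least privilege: unexpected tool, stop the agent
            return False
        # Drop timestamps that have fallen out of the sliding window.
        while self._timestamps and now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        self._timestamps.append(now)
        if len(self._timestamps) > self.max_actions_per_window:
            self.killed = True  # burst of activity, treat as an anomaly
            return False
        return True
```

Once `killed` is set, every later call returns False until an operator intervenes, which is the "kill switch" behavior described in the list above. Passing `now` explicitly (for example from `time.monotonic()`) keeps the logic deterministic and easy to test.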
Emerging Patterns and Industry Collaboration
The real-world attack scenarios documented in 2025 have catalyzed industry-wide collaboration around agentic AI security. The release of the OWASP Agentic AI Top 10 provides a shared vocabulary and framework for understanding and mitigating these risks (BleepingComputer). By categorizing and naming the most pressing threats—such as Agent Goal Hijack, Tool Misuse & Exploitation, and Supply Chain Vulnerabilities—the framework enables coordinated defense efforts across vendors, researchers, and security teams.
Notably, two of the real-world attack scenarios described above were directly cited in the OWASP Agentic AI Top 10, underscoring the practical relevance and urgency of these threats. The rapid response to vulnerabilities in widely used systems like Claude Desktop and Amazon Q demonstrates the industry’s capacity for collective action, but also highlights the need for ongoing vigilance as attackers continue to innovate.
Future Directions: Anticipating the Next Wave of Agentic AI Attacks
As agentic AI systems become more deeply embedded in organizational workflows, attackers are expected to develop increasingly sophisticated techniques. Potential future attack vectors include:
- Inter-Agent Communication Exploitation: Weak authentication and authorization between agents may allow attackers to inject malicious messages or commands, triggering cascading failures across interconnected systems.
- Memory and Context Poisoning: By corrupting agent memory or contextual data, attackers can influence future agent behavior in subtle and persistent ways.
- Rogue Agent Proliferation: Compromised or intentionally malicious agents may deviate from intended behavior, acting autonomously to propagate attacks or exfiltrate data.
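
For the inter-agent communication risk in particular, one common mitigation is to authenticate every message. The sketch below uses an HMAC over a canonical JSON body. It is a minimal example under stated assumptions: the envelope layout and function names are hypothetical, the agents are assumed to share a secret key, and replay protection (nonces or timestamps) is left out for brevity.

```python
import hashlib
import hmac
import json
from typing import Optional


def sign_message(payload: dict, key: bytes) -> dict:
    """Serialize the payload canonically and attach an HMAC-SHA256 tag."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return {"body": body.decode(), "tag": tag}


def verify_message(envelope: dict, key: bytes) -> Optional[dict]:
    """Return the payload if the tag is valid, otherwise None."""
    body = envelope["body"].encode()
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["tag"]):
        return None  # forged or modified message: drop it
    return json.loads(body)
```

A receiving agent would act only on payloads that `verify_message` returns, so a message injected by a party without the key is discarded before it can trigger any downstream action.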
The evolving threat landscape demands continuous adaptation of security practices, ongoing research into emerging attack techniques, and robust collaboration across the AI and cybersecurity communities.
Final Thoughts
The real-world attacks behind the OWASP Agentic AI Top 10 are a wake-up call for anyone deploying or relying on autonomous AI systems. These incidents underscore that agentic AI isn’t just another software layer—it’s a dynamic, decision-making entity with the power to amplify both productivity and risk. From instruction injection to supply chain manipulation, attackers are exploiting the very features that make agentic AI valuable.
Defending against these threats requires more than patching code; it demands adaptive strategies like behavioral monitoring, rapid containment, and rigorous provenance checks. The collaborative response from industry leaders and the emergence of shared frameworks like the OWASP Agentic AI Top 10 show that progress is possible—but only if vigilance keeps pace with innovation. As agentic AI continues to evolve, so too will the tactics of those seeking to exploit it. Staying ahead means learning from these real-world attacks and building security into every layer of the agentic AI ecosystem (BleepingComputer).
References
- The real-world attacks behind OWASP Agentic AI Top 10. (2025). BleepingComputer. https://www.bleepingcomputer.com/news/security/the-real-world-attacks-behind-owasp-agentic-ai-top-10/