Summary
MCP tool poisoning is a supply-chain prompt-injection class in which a malicious Model Context Protocol server embeds hidden directives inside a tool's description metadata. Because MCP clients feed the full tool description into the model's context but typically render only a simplified tool name to the user, the model reads attacker instructions (often wrapped in tags like IMPORTANT) that the human never sees. Invariant Labs disclosed this on April 1, 2025, demonstrating that merely connecting a server lets a benign-looking add() tool silently instruct the agent to read files such as ~/.cursor/mcp.json and ~/.ssh/id_rsa and exfiltrate them through innocuous-seeming parameters; this also enables 'line jumping' (Trail of Bits), where the description influences the model before any tool is invoked, and 'rug pull' variants that mutate a tool's description after the user has already approved it. The class maps to OWASP LLM01:2025 Prompt Injection and the LLM03 supply-chain risk.
How to avoid it in your code
- Display full tool descriptions to users, clearly separating user-visible from AI-visible instructions.
- Pin tool definitions with hashes or checksums and reject post-approval description changes.
- Vet and sandbox third-party MCP servers; restrict file and credential access from agent processes.
- Enforce cross-server boundaries and guardrails so one server cannot influence another's tools.
- Scan MCP servers for injection payloads before connecting (e.g. MCP-Scan).
References
Related vulnerabilities
All AI/LLM →- CRITICALAI-GROK-BANKR-WALLET-2026
In early May 2026 an attacker drained roughly $150,000 from an AI-powered crypto trading agent on X (Twitter) through prompt injection, an exploit of Grok and the linked Bankrbot agent documented by AI-security researchers including Giskard and NeuralTrust. The attacker posted a Morse-code-encoded message on X and asked Grok to translate it; Grok decoded the obfuscated payload, which contained hidden financial instructions, and the encoding let the untrusted post slip past content filters. Grok processed this user-supplied X content as a trusted directive with no separation between conversation input and authorized commands, then relayed the decoded instruction to the linked Bankrbot agent, which executed it as a legitimate order. Combined with a previously transferred Bankr Club Membership NFT that granted elevated 'Executive' wallet permissions, Bankrbot sent about 3 billion DRB tokens (roughly $150,000) on the Base network to the attacker's wallet, with no human-in-the-loop or circuit breaker on the high-value transfer. About 80% of the funds were later returned after the community identified the attacker.
- CRITICALAI-FORCEDLEAK-AGENTFORCE-2025
Disclosed on September 25, 2025 by Noma Security, ForcedLeak is a CVSS 9.4 indirect prompt-injection chain in Salesforce Agentforce affecting organizations with Web-to-Lead enabled. An attacker submits a public Web-to-Lead form and plants hidden instructions in the Description field, chosen because its roughly 42,000-character limit allows complex multi-step directives. When an employee later asks the Agentforce AI agent to process or summarize that lead, the agent ingests the attacker-controlled text as part of its context and executes the embedded commands, querying and reading internal CRM data such as lead email addresses and other contact and sales-pipeline information. The agent then exfiltrates the harvested data by embedding it in an image or link request to an expired Salesforce-related domain that remained on the Content Security Policy allow-list and was re-registered by researchers for about $5, bypassing egress controls. Salesforce remediated it on September 8, 2025 by re-securing the expired domain and enforcing Trusted URLs for Agentforce and Einstein AI; no CVE was assigned because the issue did not stem from a software version flaw.
- HIGHAI-LENOVO-LENA-XSS-2025
In 2025 Cybernews researchers disclosed that Lenovo's GPT-4-based customer-service chatbot 'Lena' could be turned into a cross-site scripting vector through a single prompt injection. A roughly 400-character prompt opened with a normal product question, then instructed the bot to format its reply as HTML and to include an image tag whose source pointed at an attacker-controlled server, insisting the image must be shown. Because the chatbot's output was rendered in the browser without sanitization or output encoding, the untrusted instruction flowed straight into live HTML, and the forced image request caused the victim's browser to call the attacker server and leak active session cookies. The impact extended to support staff: when a chat was escalated, the human agent's workstation rendered the stored malicious HTML, exposing the agent's session and enabling potential session hijacking, redirects, or malware prompts. Cybernews reported finding the flaw on July 22, 2025; Lenovo acknowledged it on August 6, 2025 and deployed fixes by August 18, 2025. The root cause was treating model output as trusted markup and rendering it without filtering.
- HIGHAI-GEMINI-INVITATION-PROMPTWARE-2025
Presented at Black Hat USA 2025 and DEF CON 33 and published August 6, 2025 by SafeBreach researchers Ben Nassi, Stav Cohen and Or Yair, this indirect prompt injection (dubbed 'promptware') hijacks Google Gemini through poisoned Google Calendar invites, emails and shared documents. An attacker sends the victim a calendar invite whose title contains hidden instructions; the malicious text sits unnoticed because long event lists hide entries behind a 'Show more' control yet still enter Gemini's context. When the victim later asks Gemini a routine request such as summarizing their schedule, the agent ingests the attacker's calendar data as trusted context and executes the embedded directives, abusing Gemini's connected agents and tool permissions. Demonstrated real-world effects included controlling Google Home smart devices to open windows, turn off lights and activate a boiler, plus geolocating the victim, starting a Zoom video stream, deleting calendar events and exfiltrating email content. The researchers privately disclosed to Google in February 2025, and Google deployed layered mitigations including user confirmations, URL sanitization and prompt-injection detection before publication.
- HIGHAI-CURSOR-MCPOISON-2025
MCPoison (CVE-2025-54136), disclosed by Check Point Research and published August 1, 2025, is a persistent remote-code-execution flaw in the Cursor AI code editor affecting versions 1.2.4 and below, rated CVSS 8.8 by NIST. The root cause is that Cursor binds trust for a Model Context Protocol server to its configuration entry's name rather than to the content of its command, so once a collaborator approves an MCP entry, later edits to that entry's underlying command are treated as already trusted and run without any re-prompt. An attacker who can edit a shared .cursor/mcp.json in a repository, or the file locally, first commits a benign MCP entry to obtain approval, then silently swaps its command for a malicious one; the payload then executes automatically every time the victim opens the project, giving durable code execution on the developer's machine. This makes shared repositories a software-supply-chain vector for IP theft and host compromise. It is distinct from CurXecute (CVE-2025-54135), which uses live prompt injection to rewrite mcp.json; MCPoison abuses trust-by-name persistence after legitimate approval. Cursor fixed it in version 1.3 by re-validating modified MCP configurations.
- HIGHCVE-2025-54135
Aim Labs disclosed CurXecute (CVE-2025-54135, CVSS 8.6), a remote-code-execution flaw in the Cursor AI code editor reachable through prompt injection. Because Cursor runs with developer-level privileges and supports the Model Context Protocol, untrusted external data pulled in by an MCP server (for example a crafted Slack message) can redirect the agent's control flow and rewrite the global mcp.json configuration to execute arbitrary commands. Potential consequences include data exfiltration, ransomware deployment, and dependency-poisoning; it was patched in Cursor 1.3 on July 29, 2025.