CRITICALAI/LLMexploited in the wild

AI-MCP-TOOL-POISONING-2025

MCP · MCP tool poisoning

Résumé

MCP tool poisoning is a supply-chain prompt-injection class in which a malicious Model Context Protocol server embeds hidden directives inside a tool's description metadata. Because MCP clients feed the full tool description into the model's context but typically render only a simplified tool name to the user, the model reads attacker instructions (often wrapped in tags like IMPORTANT) that the human never sees. Invariant Labs disclosed this on April 1, 2025, demonstrating that merely connecting a server lets a benign-looking add() tool silently instruct the agent to read files such as ~/.cursor/mcp.json and ~/.ssh/id_rsa and exfiltrate them through innocuous-seeming parameters; this also enables 'line jumping' (Trail of Bits), where the description influences the model before any tool is invoked, and 'rug pull' variants that mutate a tool's description after the user has already approved it. The class maps to OWASP LLM01:2025 Prompt Injection and the LLM03 supply-chain risk.

Comment l’éviter dans votre code

Display full tool descriptions to users, clearly separating user-visible from AI-visible instructions.
Pin tool definitions with hashes or checksums and reject post-approval description changes.
Vet and sandbox third-party MCP servers; restrict file and credential access from agent processes.
Enforce cross-server boundaries and guardrails so one server cannot influence another's tools.
Scan MCP servers for injection payloads before connecting (e.g. MCP-Scan).

Références

Vulnérabilités liées

Tout AI/LLM →