AI-LIVING-OFF-COPILOT-2024
Microsoft Copilot · Living off Microsoft Copilot
Summary
At Black Hat USA 2024, Michael Bargury of Zenity presented 'Living off Microsoft Copilot', demonstrating how indirect prompt injection, RAG poisoning, and phantom references let an attacker manipulate Microsoft 365 Copilot to exfiltrate sensitive enterprise data, bypass Data Loss Prevention (DLP) controls, and conduct AI-driven spear-phishing and social engineering. Zenity released red-team tooling including LOLCopilot, CopilotHunter, and PowerPwn v3. This was a red-team research demonstration against the live product rather than a single patched CVE.
How to avoid it in your code
- Treat RAG/retrieved content as untrusted data, not instructions; keep it clearly separated from system and user instructions in the prompt.
- Restrict Copilot data access to least privilege; deny cross-scope reads that enable exfiltration.
- Enforce DLP at the egress layer and require approval for sensitive data actions.
- Validate and pin RAG sources; detect and reject phantom/poisoned references.
- Sanitize assistant output before rendering links to prevent exfiltration and phishing.
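Two of the points above, separating retrieved content from instructions and sanitizing output links, can be sketched as follows. This is a minimal illustration, not Copilot's API or Zenity's tooling: the names `wrap_retrieved`, `sanitize_output`, `ALLOWED_HOSTS`, and the delimiter strings are all hypothetical, and the allow-listed hosts are placeholders. Delimiting retrieved text lowers the chance that the model treats it as instructions but does not eliminate it, so it should sit alongside least-privilege access and egress controls.

```python
import re
from urllib.parse import urlparse

# Hypothetical allow-list: replace with the hosts your deployment trusts.
ALLOWED_HOSTS = {"contoso.sharepoint.com", "learn.microsoft.com"}

# Hypothetical delimiters that mark retrieved text as data, not instructions.
DOC_OPEN = "<<<retrieved_document>>>"
DOC_CLOSE = "<<<end_retrieved_document>>>"

# Markdown links and images: [text](url) or ![alt](url "title")
MD_LINK = re.compile(r"(!?)\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")
# Bare http(s) URLs
BARE_URL = re.compile(r"https?://[^\s<>()\"']+")


def wrap_retrieved(doc: str) -> str:
    """Wrap retrieved text in delimiters it cannot forge or close itself."""
    # Replace any run of 3+ angle brackets so the document cannot
    # contain a look-alike of DOC_OPEN or DOC_CLOSE.
    cleaned = re.sub(r"<{3,}|>{3,}", " ", doc)
    return f"{DOC_OPEN}\n{cleaned}\n{DOC_CLOSE}"


def host_allowed(url: str) -> bool:
    """Accept only https URLs whose exact hostname is on the allow-list."""
    try:
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
    except ValueError:  # malformed URL, e.g. a broken IPv6 literal
        return False
    return parsed.scheme == "https" and host in ALLOWED_HOSTS


def sanitize_output(text: str) -> str:
    """Strip link targets that could carry data to an untrusted host."""

    def replace_md(match: re.Match) -> str:
        is_image, label, url = match.group(1), match.group(2), match.group(3)
        # Images are fetched on render without a click, so drop them all.
        if not is_image and host_allowed(url):
            return match.group(0)
        return label or "[link removed]"

    def replace_bare(match: re.Match) -> str:
        url = match.group(0)
        return url if host_allowed(url) else "[link removed]"

    return BARE_URL.sub(replace_bare, MD_LINK.sub(replace_md, text))
```

For example, `sanitize_output("See ![x](https://evil.example/p.png?d=secret)")` keeps only the alt text, so the renderer never issues the request that would carry the data out. Matching on the parsed hostname rather than on a substring is deliberate: a URL such as `https://contoso.sharepoint.com@evil.example/a` resolves to `evil.example` and is rejected.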
Related vulnerabilities
- CRITICAL AI-GROK-BANKR-WALLET-2026
In early May 2026 an attacker drained roughly $150,000 from an AI-powered crypto trading agent on X (Twitter) through prompt injection, an exploit of Grok and the linked Bankrbot agent documented by AI-security researchers including Giskard and NeuralTrust. The attacker posted a Morse-code-encoded message on X and asked Grok to translate it; Grok decoded the obfuscated payload, which contained hidden financial instructions, and the encoding let the untrusted post slip past content filters. Grok processed this user-supplied X content as a trusted directive with no separation between conversation input and authorized commands, then relayed the decoded instruction to the linked Bankrbot agent, which executed it as a legitimate order. Combined with a previously transferred Bankr Club Membership NFT that granted elevated 'Executive' wallet permissions, Bankrbot sent about 3 billion DRB tokens (roughly $150,000) on the Base network to the attacker's wallet, with no human-in-the-loop or circuit breaker on the high-value transfer. About 80% of the funds were later returned after the community identified the attacker.
- CRITICAL AI-FORCEDLEAK-AGENTFORCE-2025
Disclosed on September 25, 2025 by Noma Security, ForcedLeak is a CVSS 9.4 indirect prompt-injection chain in Salesforce Agentforce affecting organizations with Web-to-Lead enabled. An attacker submits a public Web-to-Lead form and plants hidden instructions in the Description field, chosen because its roughly 42,000-character limit allows complex multi-step directives. When an employee later asks the Agentforce AI agent to process or summarize that lead, the agent ingests the attacker-controlled text as part of its context and executes the embedded commands, querying and reading internal CRM data such as lead email addresses and other contact and sales-pipeline information. The agent then exfiltrates the harvested data by embedding it in an image or link request to an expired Salesforce-related domain that remained on the Content Security Policy allow-list and was re-registered by researchers for about $5, bypassing egress controls. Salesforce remediated it on September 8, 2025 by re-securing the expired domain and enforcing Trusted URLs for Agentforce and Einstein AI; no CVE was assigned because the issue did not stem from a software version flaw.
- HIGH AI-SHADOWLEAK-2025
ShadowLeak is a server-side zero-click indirect prompt-injection attack against ChatGPT's Deep Research agent, discovered by Radware. An attacker emails the victim a message with instructions hidden in the HTML using white-on-white text and tiny fonts; when the user runs Deep Research over their inbox, the agent autonomously follows the hidden instructions and exfiltrates personal and inbox data. The distinguishing trait is that exfiltration occurs entirely server-side within OpenAI's cloud infrastructure, making it invisible to local and enterprise network defenses. The Gmail proof of concept generalizes to any Deep Research connector; OpenAI fixed it before public disclosure with no evidence of in-the-wild exploitation.
- HIGH AI-LENOVO-LENA-XSS-2025
In 2025 Cybernews researchers disclosed that Lenovo's GPT-4-based customer-service chatbot 'Lena' could be turned into a cross-site scripting vector through a single prompt injection. A roughly 400-character prompt opened with a normal product question, then instructed the bot to format its reply as HTML and to include an image tag whose source pointed at an attacker-controlled server, insisting the image must be shown. Because the chatbot's output was rendered in the browser without sanitization or output encoding, the untrusted instruction flowed straight into live HTML, and the forced image request caused the victim's browser to call the attacker server and leak active session cookies. The impact extended to support staff: when a chat was escalated, the human agent's workstation rendered the stored malicious HTML, exposing the agent's session and enabling potential session hijacking, redirects, or malware prompts. Cybernews reported finding the flaw on July 22, 2025; Lenovo acknowledged it on August 6, 2025 and deployed fixes by August 18, 2025. The root cause was treating model output as trusted markup and rendering it without filtering.
- HIGH AI-GEMINI-INVITATION-PROMPTWARE-2025
Presented at Black Hat USA 2025 and DEF CON 33 and published August 6, 2025 by SafeBreach researchers Ben Nassi, Stav Cohen and Or Yair, this indirect prompt injection (dubbed 'promptware') hijacks Google Gemini through poisoned Google Calendar invites, emails and shared documents. An attacker sends the victim a calendar invite whose title contains hidden instructions; the malicious text sits unnoticed because long event lists hide entries behind a 'Show more' control yet still enter Gemini's context. When the victim later asks Gemini a routine request such as summarizing their schedule, the agent ingests the attacker's calendar data as trusted context and executes the embedded directives, abusing Gemini's connected agents and tool permissions. Demonstrated real-world effects included controlling Google Home smart devices to open windows, turn off lights and activate a boiler, plus geolocating the victim, starting a Zoom video stream, deleting calendar events and exfiltrating email content. The researchers privately disclosed to Google in February 2025, and Google deployed layered mitigations including user confirmations, URL sanitization and prompt-injection detection before publication.
- MEDIUM AI-GEMINI-WORKSPACE-2025
Marco Figueroa of Mozilla's 0DIN program documented a Gemini for Workspace flaw where an attacker hides instructions inside an email using HTML elements styled with font-size zero or white-on-white text, invisible to the recipient. When the user clicks 'Summarize this email', Gemini processes the raw HTML and treats the hidden directive as a high-priority instruction, appending an attacker-crafted fake security warning, for example one that lists a fake support phone number, that appears to come from Google. No links or attachments are required, enabling credential harvesting and vishing at scale through indirect prompt injection.
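
The hidden-text technique that appears in the ShadowLeak and Gemini for Workspace entries above can be screened for before email HTML reaches a model. The sketch below is a heuristic under stated assumptions, not the detection used by Google, OpenAI, or the researchers: `find_hidden_text`, `HiddenTextFinder`, and the style patterns are hypothetical, it inspects only inline `style` attributes, and it flags white text without checking the background colour, so it can produce false positives and miss styles applied through CSS classes.

```python
import re
from html.parser import HTMLParser

# Inline styles that commonly hide text from a human reader (heuristic).
HIDDEN_STYLE = re.compile(
    r"font-size\s*:\s*0(?![\d.]*[1-9])"      # 0, 0px, 0.0em but not 0.5em
    r"|display\s*:\s*none"
    r"|visibility\s*:\s*hidden"
    r"|(?<![-\w])color\s*:\s*(?:#fff(?:fff)?\b|white\b)",
    re.IGNORECASE,
)

# Elements with no closing tag; they never contain text.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Collect text nested inside elements whose inline style hides it."""

    def __init__(self) -> None:
        super().__init__()
        self._stack: list[tuple[str, bool]] = []  # (tag, is_hidden)
        self.hidden_text: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = bool(self._stack) and self._stack[-1][1]
        hidden = parent_hidden or bool(HIDDEN_STYLE.search(style))
        self._stack.append((tag, hidden))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        if self._stack and self._stack[-1][1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html: str) -> list[str]:
    """Return text fragments a reader would likely not see."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

A caller could quarantine or strip any message for which `find_hidden_text` returns a non-empty list before passing the remaining visible text to a summarizer.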