Summary
Langflow: BaseFileComponent-based nodes arbitrary file read with RCE exploit
Advisory details
Summary
All components based on BaseFileComponent are affected by this vulnerability:
- Docling (DoclingInlineComponent)
- Docling Serve (DoclingRemoteComponent)
- Read File (FileComponent)
- NVIDIA Retriever Extraction (NvidiaIngestComponent)
- Video File (VideoFileComponent)
- Unstructured API (UnstructuredComponent)
For clarity, from now on I'll refer only to the Read File component.
The Read File node processes user-controlled files. An example scenario is a RAG chatbot: a system that lets users of an organization ask questions about documents stored by that organization.
By controlling a file that is ingested into the RAG, an attacker can direct the node to read any file on the filesystem by absolute path.
Using this vulnerability, an attacker can achieve RCE:
- Upload a file that directs the node to read Langflow's secret_key file containing the JWT token secret.
- This would then allow the attacker to simply ask the Chatbot for the JWT secret.
- Using this secret, the attacker then crafts a JWT token for any user-id, bypassing authentication.
- Code execution is then trivial - simply create a new flow with "Python Interpreter" node, fill it with arbitrary Python code and execute it.
Tested on commit 2d67402b1dbaefcbce85a244d4a6cd5e4bda1cfe
Details
The vulnerability is in:
langflow/src/lfx/src/lfx/base/data/base_file.py
Specifically in _unpack_bundle. This function extracts tar files, which can contain symlinks.
A symlink can point to any file in the filesystem. Then, in self.process_files(), the file pointed to by the symlink is parsed and saved into the RAG.
A single tar can carry any number of symlinks, which can also be useful in some scenarios.
Suggested fix: iterate over the archive members and make sure all of them are regular files or directories.
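The suggested fix can be illustrated with a minimal sketch using Python's standard tarfile module. This is not the code from Langflow's patch; the function names find_unsafe_members and extract_regular_only are hypothetical, and the destination-path check is an extra precaution that the advisory does not mention.

```python
import io
import shutil
import tarfile
from pathlib import Path


def find_unsafe_members(tar: tarfile.TarFile) -> list[str]:
    """Names of members that are neither regular files nor directories."""
    return [m.name for m in tar.getmembers() if not (m.isfile() or m.isdir())]


def extract_regular_only(tar: tarfile.TarFile, dest: Path) -> None:
    """Extract the archive only if every member is a regular file or directory."""
    unsafe = find_unsafe_members(tar)
    if unsafe:
        # Symlinks, hardlinks, devices and FIFOs all end up here.
        raise ValueError(f"refusing non-regular tar members: {unsafe}")
    dest = dest.resolve()
    for member in tar.getmembers():
        target = (dest / member.name).resolve()
        # Extra check (assumption, not part of the suggested fix): stay inside dest.
        if not target.is_relative_to(dest):
            raise ValueError(f"member escapes destination: {member.name}")
        if member.isdir():
            target.mkdir(parents=True, exist_ok=True)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        source = tar.extractfile(member)
        with source, open(target, "wb") as out:
            shutil.copyfileobj(source, out)
```

Rejecting the whole archive, rather than silently skipping the offending members, keeps the failure visible to whoever operates the flow.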
PoC
Reproduction:
- Create a flow with Read File (or any other affected component), and connect its output to some storage such as Chroma DB.
- Create a symlink pointing to any file. For the above exploit, point the symlink to Langflow's JWT token file.
- Compress this symlink with tar.
- Upload it to the Read File component.
- Check the database, or ask a Chatbot connected to this vector database for the contents of the file.
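For regression-testing a fix, the archive described in these steps can also be built in memory. The sketch below is only an illustration: the helper names build_symlink_tar and symlink_targets, the member name notes.txt, and the target /tmp/placeholder.txt are all hypothetical and are not taken from the original PoC archive.

```python
import io
import tarfile


def build_symlink_tar(link_name: str, target: str) -> bytes:
    """Return the bytes of a tar archive holding a single symlink member."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        member = tarfile.TarInfo(name=link_name)
        member.type = tarfile.SYMTYPE  # symbolic link entry, carries no data
        member.linkname = target       # path the link points at
        tar.addfile(member)
    return buffer.getvalue()


def symlink_targets(archive: bytes) -> dict[str, str]:
    """Map each symlink member name to the path it points at."""
    with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
        return {m.name: m.linkname for m in tar.getmembers() if m.issym()}


# Placeholder fixture: nothing is written to disk and no real file is read.
fixture = build_symlink_tar("notes.txt", "/tmp/placeholder.txt")
```

A patched component should refuse such a fixture, while symlink_targets gives a quick way to inspect a suspicious upload without extracting it.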
Concrete PoC:
- Flow with RAG ingestion and a Chatbot around it: Vector Store RAG.json
- Exploit tar: archive.tar.txt (remove .txt, GitHub blocked .tar)
- Create a file /tmp/trip.docx with any contents in it
- Ingest the file in the flow above, and ask the Chatbot a question about this file.
A demo showing the attack:
https://github.com/user-attachments/assets/af00f700-f13f-4eac-848e-8afd11fb9297
In the demo, the attacker steals the Langflow secret key used to sign JWTs. The second stage of the attack, not shown in the demo, uses this key to sign a JWT and then executes Python code on the server through the Python code interpreter node.
Impact
Any Langflow user who uses any of the above-mentioned components to ingest user-controlled data is affected. Depending on the exact scenario, the user can also be exposed to an RCE risk.
Patches
Fixed in 1.9.2 via PR #12945. BaseFileComponent._unpack_bundle now rejects symlink and hardlink members (and any non-regular entries) during TAR extraction, with additional defensive symlink filtering during directory recursion and after extraction. Upgrade to 1.9.2 or later.
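The defensive symlink filtering during directory recursion can be pictured with the following minimal sketch. It is an assumption about the general technique, not the code from PR #12945, and iter_regular_files is a hypothetical name.

```python
import os
from pathlib import Path


def iter_regular_files(root: Path):
    """Yield regular files under root, never following or returning symlinks."""
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        base = Path(dirpath)
        # Prune symlinked directories so they are neither reported nor entered.
        dirnames[:] = [d for d in dirnames if not (base / d).is_symlink()]
        for name in filenames:
            path = base / name
            # Skip symlinks (including dangling ones) and anything non-regular.
            if path.is_symlink() or not path.is_file():
                continue
            yield path
```

Filtering at this stage is a second line of defence: even if a link slipped through extraction, it would not reach the file-processing step.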
Ori Lahav Security Researcher @ Rubrik Inc.