medium

CVE-2025-46570

PyPI · vllm

Summary

vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed and the PagedAttention mechanism finds a matching prefix chunk, the prefill process speeds up, and the speedup shows in the TTFT (time to first token). These timing differences…
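The mechanism described in the summary can be illustrated with a toy model. The sketch below is not vLLM's implementation: `ToyPrefixCache`, `CHUNK`, and `prefill_cost` are hypothetical names, the chunk size is arbitrary, and the "cost" is a count of computed chunks standing in for TTFT. It only shows why a prompt that shares a cached prefix does less prefill work than one that does not, which is the difference an observer could measure.

```python
from typing import List, Set, Tuple

CHUNK = 16  # hypothetical chunk size in tokens, chosen for illustration only


class ToyPrefixCache:
    """Toy model of chunk-level prefix caching (not vLLM's actual code)."""

    def __init__(self) -> None:
        # Each entry is a full prompt prefix ending on a chunk boundary.
        self._seen: Set[Tuple[int, ...]] = set()

    def prefill_cost(self, tokens: List[int]) -> int:
        """Return how many chunks had to be computed (a stand-in for TTFT)."""
        cost = 0
        for end in range(CHUNK, len(tokens) + 1, CHUNK):
            prefix = tuple(tokens[:end])
            if prefix in self._seen:
                continue  # cached chunk: no prefill work needed
            cost += 1
            self._seen.add(prefix)
        if len(tokens) % CHUNK:
            cost += 1  # a trailing partial chunk is always computed
        return cost


cache = ToyPrefixCache()
victim_prompt = list(range(64))                    # four full chunks
first_cost = cache.prefill_cost(victim_prompt)     # 4: nothing cached yet

# A later request sharing the first 32 tokens skips two chunks.
matching_guess = victim_prompt[:32] + [999] * 32
match_cost = cache.prefill_cost(matching_guess)    # 2

# A request with no shared prefix pays the full cost.
miss_cost = cache.prefill_cost([999] * 64)         # 4
```

In this model the gap between `match_cost` and `miss_cost` plays the role of the TTFT difference: whoever can time their own requests learns whether a guessed prefix was already in the cache.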

Severity: medium
EPSS: 0.2% (p16)
Also known as: GHSA-4qjh-9fv9-r85r, PYSEC-2025-53
Published: 2025-05-29
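Since the summary places the affected range before version 0.9.0, a quick local check is to compare the installed `vllm` version against that boundary. The following is a minimal sketch using only the standard library. The helper names (`release_tuple`, `is_affected`, `installed_vllm_affected`) are made up for this example, and the parsing is deliberately simplified: it looks only at the leading numeric release components, so pre-release or local builds of 0.9.0 are treated as fixed, which may not be accurate.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

FIXED = (0, 9, 0)  # boundary taken from the advisory text: "prior to version 0.9.0"


def release_tuple(version: str) -> Tuple[int, ...]:
    """Extract leading numeric components, e.g. '0.8.5.post1' -> (0, 8, 5)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad short versions such as '0.9' so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


def is_affected(version: str) -> bool:
    """True if the release components fall before the fixed version."""
    return release_tuple(version) < FIXED


def installed_vllm_affected() -> Optional[bool]:
    """Check the installed vllm distribution; None if it is not installed."""
    try:
        return is_affected(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return None
```

For example, `is_affected("0.8.5")` is true and `is_affected("0.9.0")` is false. A full version comparison that handles pre-releases correctly would need a proper version parser; this sketch trades that for having no dependencies.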

Sources: CISA KEV (public domain), OSV.dev & GitHub Advisory Database (CC-BY-4.0), FIRST EPSS, NVD/CWE (public domain). Served live from the Stateward advisory database.