vllm

PyPI
56 known CVEs affecting this package

Aggregated from OSV, GitHub Security Advisories, NVD, and vendor advisories. Each CVE links to its full detail page with vendor advisories, patches, fixed versions, and remediation guidance.

CVEs affecting vllm (page 1 of 2)

CVE-2024-11041 CRITICAL CVSS 9.8 EG 9.8
2025-03-20
vulnerable: 0.0.1 ... 0.6.2 (40 versions)
vllm-project vllm version v0.6.2 contains a vulnerability in the MessageQueue.dequeue() API function. The function uses pickle.loads to parse received sockets directly, leading to a remote code execution vulnerability. An attacker can expl…
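The risk described above comes from how pickle works in general: unpickling can call any importable callable named in the byte stream. The following is a minimal sketch, not vLLM's code. The `Payload` class and `decode_message` helper are hypothetical names used only for illustration, and the JSON-based decoder is one possible safer pattern, not the project's actual fix.

```python
import json
import pickle


class Payload:
    """Hypothetical stand-in for attacker-controlled data."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it on load.
        # int("42") is harmless; an attacker could name any importable callable.
        return (int, ("42",))


untrusted = pickle.dumps(Payload())
result = pickle.loads(untrusted)  # runs int("42"); no Payload object is rebuilt


def decode_message(raw: bytes) -> dict:
    """Decode a message as data only: JSON cannot trigger callables on load."""
    message = json.loads(raw.decode("utf-8"))
    if not isinstance(message, dict):
        raise ValueError("expected a JSON object")
    return message
```

A data-only format such as JSON removes the code-execution path, at the cost of not supporting arbitrary Python objects.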
CVE-2024-8768 HIGH CVSS 7.5 EG 7.5 ✓ Fixed in 0.5.5
2024-09-17
vulnerable: 0.0.1 ... 0.5.4 (34 versions)
A flaw was found in the vLLM library. A completions API request with an empty prompt will crash the vLLM API server, resulting in a denial of service.
CVE-2024-8939 MEDIUM CVSS 6.2 EG 6.2
2024-09-17
vulnerable: 0.0.1 ... 0.5.0.post1 (29 versions)
A vulnerability was found in the ilab model serve component, where improper handling of the best_of parameter in the vllm JSON web API can lead to a Denial of Service (DoS). The API used for LLM-based sentence or chat completion accepts a …
CVE-2024-9053 CRITICAL CVSS 9.8 EG 9.8
2025-03-20
vulnerable: 0.0.1 ... 0.6.0 (36 versions)
vllm-project vllm version 0.6.0 contains a vulnerability in the AsyncEngineRPCServer() RPC server entrypoints. The core functionality run_server_loop() calls the function _make_handler_coro(), which directly uses cloudpickle.loads() on rec…
CVE-2025-24357 HIGH CVSS 7.5 EG 7.5 ✓ Fixed in 0.7.0
2025-01-27
vulnerable: 0.0.1 ... 0.6.6.post1 (47 versions)
vLLM is a library for LLM inference and serving. vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load the model checkpoint, which is downloaded from huggingface. It uses the torch.load function and the weights_o…
CVE-2025-25183 LOW CVSS 2.6 EG 2.6 ✓ Fixed in 0.7.2
2025-02-07
vulnerable: 0.0.1 ... 0.7.1 (49 versions)
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Maliciously constructed statements can lead to hash collisions, resulting in cache reuse, which can interfere with subsequent responses and cause uninten…
CVE-2025-29770 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.8.0
2025-03-19
vulnerable: 0.0.1 ... 0.7.3 (51 versions)
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. The outlines library is one of the backends used by vLLM to support structured output (a.k.a. guided decoding). Outlines provides an optional cache for i…
CVE-2025-29783 CRITICAL CVSS 9.0 EG 9.0 ✓ Fixed in 0.8.0
2025-03-19
vulnerable: 0.6.5 ... 0.7.3 (7 versions)
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP on all network interfaces will allow attackers to execute r…
CVE-2025-30165 HIGH CVSS 8.0 EG 8.0 ✓ Fixed in 0.10.0
2025-05-06
vulnerable: 0.5.2 ... 0.9.2 (32 versions)
vLLM is an inference and serving engine for large language models. In a multi-node vLLM deployment using the V0 engine, vLLM uses ZeroMQ for some multi-node communication purposes. The secondary vLLM hosts open a `SUB` ZeroMQ socket and co…
CVE-2025-30202 HIGH CVSS 7.5 EG 7.5 ✓ Fixed in 0.8.5
2025-04-30
vulnerable: 0.5.2 ... 0.8.4 (26 versions)
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.5.2 and prior to 0.8.5 are vulnerable to denial of service and data exposure via ZeroMQ on multi-node vLLM deployment. In a mult…
CVE-2025-32444 CRITICAL CVSS 10.0 EG 10.0 ✓ Fixed in 0.8.5
2025-04-30
vulnerable: 0.6.5 ... submission (68 versions)
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.6.5 and prior to 0.8.5, having vLLM integration with mooncake, are vulnerable to remote code execution due to using pickle based…
CVE-2025-46560 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.8.5
2025-04-30
vulnerable: 0.8.0, 0.8.1, 0.8.2, 0.8.3, 0.8.4
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal t…
CVE-2025-46570 LOW CVSS 2.6 EG 2.6 ✓ Fixed in 0.9.0
2025-05-29
vulnerable: 0.0.1 ... 0.8.5.post1 (58 versions)
vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed, if the PagedAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflect…
CVE-2025-46722 MEDIUM CVSS 4.2 EG 4.2 ✓ Fixed in 0.9.0
2025-05-29
vulnerable: 0.0.1 ... 0.8.5.post1 (58 versions)
vLLM is an inference and serving engine for large language models (LLMs). In versions starting from 0.7.0 to before 0.9.0, in the file vllm/multimodal/hasher.py, the MultiModalHasher class has a security and data integrity issue in its ima…
CVE-2025-47277 CRITICAL CVSS 9.8 EG 9.8 ✓ Fixed in 0.8.5
2025-05-20
vulnerable: 0.6.5 ... 0.8.4 (12 versions)
vLLM, an inference and serving engine for large language models (LLMs), has an issue in versions 0.6.5 through 0.8.4 that ONLY impacts environments using the `PyNcclPipe` KV cache transfer integration with the V0 engine. No other configura…
CVE-2025-48887 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.9.0
2025-05-30
vulnerable: 0.6.4 ... submission (72 versions)
vLLM, an inference and serving engine for large language models (LLMs), has a Regular Expression Denial of Service (ReDoS) vulnerability in the file `vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py` of versions 0.6.4 up to but…
CVE-2025-48942 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.9.0
2025-05-30
vulnerable: 0.8.0 ... 0.8.5.post1 (7 versions)
vLLM is an inference and serving engine for large language models (LLMs). In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with an invalid json_schema as a Guided Param kills the vllm server. This vulnerability …
CVE-2025-48943 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.9.0
2025-05-30
vulnerable: 0.8.0 ... 0.8.5.post1 (7 versions)
vLLM is an inference and serving engine for large language models (LLMs). Versions 0.8.0 up to but excluding 0.9.0 have a Denial of Service (ReDoS) vulnerability that causes the vLLM server to crash if an invalid regex is provided while using structured…
CVE-2025-48944 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.9.0
2025-05-30
vulnerable: 0.8.0 ... 0.8.5.post1 (7 versions)
vLLM is an inference and serving engine for large language models (LLMs). In version 0.8.0 up to but excluding 0.9.0, the vLLM backend used with the /v1/chat/completions OpenAPI endpoint fails to validate unexpected or malformed input in t…
CVE-2025-48956 HIGH CVSS 7.5 EG 7.5 ✓ Fixed in 0.10.1.1
2025-08-21
vulnerable: 0.1.0 ... 0.9.2 (63 versions)
vLLM is an inference and serving engine for large language models (LLMs). From 0.1.0 to before 0.10.1.1, a Denial of Service (DoS) vulnerability can be triggered by sending a single HTTP GET request with an extremely large header to an HTT…
CVE-2025-59425 HIGH CVSS 7.5 EG 7.5 ✓ Fixed in 0.11.0
2025-10-07
vulnerable: 0.0.1 ... 0.9.2 (66 versions)
vLLM is an inference and serving engine for large language models (LLMs). Before version 0.11.0rc2, the API key support in vLLM performs validation using a method that was vulnerable to a timing attack. API key validation uses a string com…
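An ordinary string comparison can return as soon as it finds the first differing character, so response time may leak how much of a guessed key is correct. A minimal sketch of the standard mitigation follows, using Python's `hmac.compare_digest`. The function name `api_key_matches` is a hypothetical helper, not vLLM's API, and this is not presented as the project's actual patch.

```python
import hmac


def api_key_matches(supplied: str, expected: str) -> bool:
    """Compare two API keys in a way that resists timing analysis."""
    # Encode to bytes so non-ASCII input is accepted by compare_digest.
    return hmac.compare_digest(
        supplied.encode("utf-8"),
        expected.encode("utf-8"),
    )
```

For example, `api_key_matches("sk-guess", "sk-real-key")` returns `False` without short-circuiting on the first mismatched byte.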
CVE-2025-62164 HIGH CVSS 8.8 EG 8.8 ✓ Fixed in 0.11.1
2025-11-21
vulnerable: 0.10.2, 0.11.0
vLLM is an inference and serving engine for large language models (LLMs). From versions 0.10.2 to before 0.11.1, a memory corruption vulnerability could lead to a crash (denial-of-service) and potentially remote code execution (RCE), exist…
CVE-2025-62372 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.11.1
2025-11-21
vulnerable: 0.10.0 ... 0.9.2 (33 versions)
vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, users can crash the vLLM engine serving multimodal models by passing multimodal embedding inputs with correct ndim but incorrect…
CVE-2025-6242 HIGH CVSS 7.1 EG 7.1 ✓ Fixed in 0.11.0
2025-10-07
vulnerable: 0.10.0 ... 0.9.2 (39 versions)
A Server-Side Request Forgery (SSRF) vulnerability exists in the MediaConnector class within the vLLM project's multimodal feature set. The load_from_url and load_from_url_async methods fetch and process media from user-provided URLs witho…
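One common defence against this class of SSRF is to check a user-supplied URL against an explicit allowlist before fetching it. The sketch below illustrates that general idea only. `ALLOWED_HOSTS`, `media.example.com`, and `is_url_allowed` are hypothetical names, and this is not the validation vLLM ships.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would configure its own hosts.
ALLOWED_HOSTS = {"media.example.com"}


def is_url_allowed(url: str) -> bool:
    """Accept only http(s) URLs whose parsed hostname is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # .hostname is lowercased and excludes any userinfo ("user@") and port.
    host = parts.hostname
    return host is not None and host in ALLOWED_HOSTS
```

The check is only sound if the fetch uses the same parser and the same parsed host as the validation; a mismatch between the two is exactly the kind of gap that allows a bypass.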
CVE-2025-62426 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.11.1
2025-11-21
vulnerable: 0.10.0 ... 0.9.2 (33 versions)
vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, the /v1/chat/completions and /tokenize endpoints allow a chat_template_kwargs request parameter that is used in the code before …
CVE-2025-66448 HIGH CVSS 8.8 EG 8.8 ✓ Fixed in 0.11.1
2025-12-01
vulnerable: 0.0.1 ... 0.9.2 (67 versions)
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.11.1, vllm has a critical remote code execution vector in a config class named Nemotron_Nano_VL_Config. When vllm loads a model config that contains an au…
CVE-2025-71379 HIGH CVSS 7.5 EG 7.5 ✓ Fixed in 0.9.0
2026-06-20
vulnerable: 0.6.3 ... 0.8.5.post1 (18 versions)
vLLM versions >= 0.6.3 and < 0.9.0 contain multiple regular expression denial of service (ReDoS) vulnerabilities. Several regex patterns, in vllm/lora/utils.py, the phi4mini tool parser, and the OpenAI-compatible serving chat endpoint …
CVE-2026-12491 MEDIUM CVSS 4.8 EG 4.8
2026-06-17
vulnerable: 0.11.0 ... 0.23.0 (23 versions)
A flaw was found in vLLM, an open-source library for large language model inference. This vulnerability arises from improper handling of image metadata, specifically EXIF orientation and PNG transparency (tRNS) data, during image processin…
CVE-2026-22773 MEDIUM CVSS 7.5 EG 6.5 ✓ Fixed in 0.12.0
2026-01-10
vulnerable: 0.10.0 ... 0.9.2 (27 versions)
vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.6.4 to before 0.12.0, users can crash the vLLM engine serving multimodal models that use the Idefics3 vision model implementation by sending a spe…
CVE-2026-22778 CRITICAL CVSS 9.8 EG 9.8 ✓ Fixed in 0.14.1
2026-02-02
vulnerable: 0.10.0 ... 0.9.2 (18 versions)
vLLM is an inference and serving engine for large language models (LLMs). From 0.8.3 to before 0.14.1, when an invalid image is sent to vLLM's multimodal endpoint, PIL throws an error. vLLM returns this error to the client, leaking a heap …
CVE-2026-22807 CRITICAL CVSS 9.8 EG 9.8 ✓ Fixed in 0.14.0
2026-01-21
vulnerable: 0.10.1 ... 0.13.0 (8 versions)
vLLM is an inference and serving engine for large language models (LLMs). Starting in version 0.10.1 and prior to version 0.14.0, vLLM loads Hugging Face `auto_map` dynamic modules during model resolution without gating on `trust_remote_co…
CVE-2026-24779 HIGH CVSS 7.1 EG 7.1 ✓ Fixed in 0.14.1
2026-01-27
vulnerable: 0.0.1 ... 0.9.2 (72 versions)
vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.14.1, a Server-Side Request Forgery (SSRF) vulnerability exists in the `MediaConnector` class within the vLLM project's multimodal feature set. Th…
CVE-2026-25960 CRITICAL CVSS 9.8 EG 9.8 ✓ Fixed in 0.17.0
2026-03-09
vulnerable: 0.15.1, 0.16.0
vLLM is an inference and serving engine for large language models (LLMs). The SSRF protection fix for CVE-2026-24779 added in 0.15.1 can be bypassed in the load_from_url_async method due to inconsistent URL parsing behavior between the valid…
CVE-2026-27893 HIGH CVSS 8.8 EG 8.8 ✓ Fixed in 0.18.0
2026-03-27
vulnerable: 0.10.1 ... 0.17.1 (15 versions)
vLLM is an inference and serving engine for large language models (LLMs). Starting in version 0.10.1 and prior to version 0.18.0, two model implementation files hardcode `trust_remote_code=True` when loading sub-components, bypassing the u…
CVE-2026-34753 MEDIUM CVSS 5.4 EG 5.4 ✓ Fixed in 0.19.0
2026-04-06
vulnerable: 0.16.0, 0.17.0, 0.17.1, 0.18.0, 0.18.1
vLLM is an inference and serving engine for large language models (LLMs). From 0.16.0 to before 0.19.0, a server-side request forgery (SSRF) vulnerability in download_bytes_from_url allows any actor who can control batch input JSON to make…
CVE-2026-34755 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.19.0
2026-04-06
vulnerable: 0.10.0 ... 0.9.2 (33 versions)
vLLM is an inference and serving engine for large language models (LLMs). From 0.7.0 to before 0.19.0, the VideoMediaIO.load_base64() method at vllm/multimodal/media/video.py splits video/jpeg data URLs by comma to extract individual JPEG …
CVE-2026-34756 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.19.0
2026-04-06
vulnerable: 0.1.0 ... 0.9.2 (79 versions)
vLLM is an inference and serving engine for large language models (LLMs). From 0.1.0 to before 0.19.0, a Denial of Service vulnerability exists in the vLLM OpenAI-compatible API server. Due to the lack of an upper bound validation on the n…
CVE-2026-34760 HIGH CVSS 7.1 EG 7.1 ✓ Fixed in 0.18.0
2026-04-02
vulnerable: 0.10.0 ... 0.9.2 (44 versions)
vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before version 0.18.0, Librosa defaults to using numpy.mean for mono downmixing (to_mono), while the international standard ITU-R BS.775-4 spec…
CVE-2026-41523 HIGH CVSS 7.5 EG 7.5 ✓ Fixed in 0.22.0
2026-06-16
vulnerable: 0.0.1 ... 0.9.2 (86 versions)
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.22.0, an assert-based security check in vLLM's activation function loading allows any unauthenticated attacker to achieve arbitrary code execution on the …
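The truncated description does not say how the assert is bypassed. One general reason `assert` is unsuitable as a security gate is that Python removes assert statements when optimization is enabled (for example `python -O`). The sketch below shows that general behavior using `compile(..., optimize=...)`. All names (`ALLOWED`, `check_with_assert`, `check_explicit`) are hypothetical and not taken from vLLM.

```python
SOURCE = '''
ALLOWED = {"gelu", "relu"}

def check_with_assert(name):
    assert name in ALLOWED, "activation not allowed"
    return name
'''


def load(optimize: int):
    """Compile SOURCE at the given optimization level and return the checker."""
    namespace = {}
    exec(compile(SOURCE, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace["check_with_assert"]


def check_explicit(name: str, allowed=frozenset({"gelu", "relu"})) -> str:
    """An explicit check that survives any optimization level."""
    if name not in allowed:
        raise ValueError(f"activation not allowed: {name!r}")
    return name


kept = load(0)      # asserts present: a disallowed name raises AssertionError
stripped = load(1)  # asserts removed: a disallowed name passes through
```

Raising an exception from an ordinary `if` keeps the check in place regardless of how the interpreter is started.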
CVE-2026-44222 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.20.0
2026-05-12
vulnerable: 0.10.0 ... 0.9.2 (46 versions)
vLLM is an inference and serving engine for large language models (LLMs). From 0.6.1 to before 0.20.0, there is a Token Injection vulnerability in vLLM’s multimodal processing. Unauthenticated, text-only prompts that spell special toke…
CVE-2026-44223 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.20.0
2026-05-12
vulnerable: 0.18.0, 0.18.1, 0.19.0, 0.19.1
vLLM is an inference and serving engine for large language models (LLMs). From 0.18.0 to before 0.20.0, the extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, c…
CVE-2026-47155 MEDIUM CVSS 6.5 EG 6.5 ✓ Fixed in 0.22.0
2026-06-10
vulnerable: 0.0.1 ... 0.9.2 (86 versions)
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.22.0, vLLM's revision pinning controls do not consistently apply to all artifacts loaded for a model. A deployment that supplies --revision or --code-revi…
CVE-2026-48746 CRITICAL CVSS 9.1 EG 9.1 ✓ Fixed in 0.22.0
2026-06-16
vulnerable: 0.10.0 ... 0.9.2 (68 versions)
vLLM is an inference and serving engine for large language models (LLMs). From 0.3.0 until 0.22.0, a vulnerability in ASGI web servers and starlette's trust on those web servers enables an authentication bypass of the OpenAI API Authentica…
CVE-2026-53923 HIGH CVSS 7.5 EG 7.5
2026-06-17
vulnerable: 0.10.0 ... 0.9.2 (55 versions)
vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor p…
CVE-2026-54232 HIGH CVSS 8.8 EG 8.8 ✓ Fixed in 0.22.1
2026-06-22
vulnerable: 0.0.1 ... 0.9.2 (87 versions)
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.22.1, the vLLM Dockerfile is vulnerable to a dependency confusion attack through the flashinfer-jit-cache package. The package is installed from a custom …
CVE-2026-54233 MEDIUM CVSS 6.5 EG 6.5
2026-06-17
vulnerable: 0.0.1 ... 0.9.2 (89 versions)
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, vLLM's /v1/audio/transcriptions endpoint limits compressed upload size but not decoded PCM output. A 25MB OPUS file expands to ~14.9GB of float32…
CVE-2026-54234 HIGH CVSS 7.5 EG 7.5 ✓ Fixed in 0.24.0
2026-07-06
vulnerable: 0.17.1 ... 0.23.0 (12 versions)
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Prior to 0.24.0, a frontend-legal multi-request speculative decoding workload can cause the rejection sampler to produce a recovered token equal to the m…
CVE-2026-54235 MEDIUM CVSS 6.5 EG 6.5
2026-06-17
vulnerable: 0.0.1 ... 0.9.2 (89 versions)
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, all temperature validation gates use comparison operators (<, >), which silently evaluate to False for NaN and for positive Infinity in Python's I…
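The comparison behavior described above can be reproduced in a few lines. This is a minimal sketch under assumptions: the lower bound of 0.0 and the function names `naive_temperature_ok` and `strict_temperature_ok` are hypothetical, and the strict variant is one possible remedy rather than the project's actual patch.

```python
import math


def naive_temperature_ok(temperature: float) -> bool:
    """Rejects only when `temperature < 0.0` is True."""
    # For NaN every ordered comparison is False, and inf is not below 0.0,
    # so neither value is rejected here.
    return not (temperature < 0.0)


def strict_temperature_ok(temperature: float) -> bool:
    """Requires a finite value before applying the range check."""
    return math.isfinite(temperature) and temperature >= 0.0
```

With these definitions, `naive_temperature_ok(float("nan"))` and `naive_temperature_ok(float("inf"))` both return `True`, while the strict variant returns `False` for both.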
CVE-2026-54236 MEDIUM CVSS 5.3 EG 5.3
2026-06-17
vulnerable: 0.0.1 ... 0.9.2 (89 versions)
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, the fix for CVE-2026-22778, which introduced a sanitize_message helper that strips object-repr memory addresses from error messages before they r…
CVE-2026-5497 HIGH CVSS 7.5 EG 7.5 ✓ Fixed in 0.19.0
2026-06-11
vulnerable: 0.10.0 ... 0.9.2 (29 versions)
vLLM versions 0.8.0 and later are vulnerable to an Out-of-Memory (OOM) Denial of Service (DoS) attack due to unbounded frame count processing in the `VideoMediaIO.load_base64()` method. When processing `video/jpeg` data URLs, the method sp…