What is GHSA-8jr5-v98p-w75m?

GHSA-8jr5-v98p-w75m is a Medium-severity security advisory published by GitHub Security Advisories. Its title is "vLLM: image EXIF Rotation & PNG tRNS Transparency Not Normalized, Causing Mismatch Between Model Input and Expectations".

Which CVE IDs does GHSA-8jr5-v98p-w75m cover?

GHSA-8jr5-v98p-w75m covers the following CVE IDs: CVE-2026-12491. Each CVE has its own detail page on EchelonGraph Pulse with the synthesized EG score.

What is the CVSS v3 score of GHSA-8jr5-v98p-w75m?

GHSA-8jr5-v98p-w75m has a CVSS v3 base score of 4.8, published by GitHub Security Advisories.

Which products are affected by GHSA-8jr5-v98p-w75m?

GHSA-8jr5-v98p-w75m affects 1 product: pip/vllm:>= 0.11.0, <= 0.23.0.

GHSA-8jr5-v98p-w75m | Medium | CVSS 4.8

vLLM: image EXIF Rotation & PNG tRNS Transparency Not Normalized, Causing Mismatch Between Model Input and Expectations

Vendor

GitHub Security Advisories

Published

June 17, 2026

Last Modified

June 18, 2026

🔗 CVE IDs covered (1)

CVE-2026-12491

📋 Description

Summary

Issue 1: EXIF orientation not normalized → The image orientation processed by the model differs from how humans view it, introducing interpretation bias.

Issue 2: PNG tRNS transparency is not explicitly flattened before conversion to RGB → After conversion, transparent or semi-transparent pixels render unexpectedly, which makes otherwise subtle overlay elements visible and distorts the input content. (This attack is similar to AlphaDog: vLLM already handles RGBA correctly, but because tRNS lets images in non-RGBA modes such as RGB carry transparency, those images do not take the correct processing path.)

Issue 3: Pillow loads only the first frame of APNG or GIF files.
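Issue 3 can be illustrated with a minimal sketch. It assumes only Pillow itself; count_frames and make_two_frame_gif are hypothetical helper names introduced for this example and are not part of vLLM. The sketch builds a two-frame GIF in memory, decodes it the usual way (which yields frame 0), and separately reports how many frames the file actually holds.

```python
import io

from PIL import Image


def count_frames(data: bytes) -> int:
    """Hypothetical helper: report how many frames an encoded image contains."""
    with Image.open(io.BytesIO(data)) as img:
        # Multi-frame plugins expose n_frames; fall back to 1 if it is absent.
        return getattr(img, "n_frames", 1)


def make_two_frame_gif() -> bytes:
    """Build a tiny two-frame GIF in memory for the demonstration."""
    frames = [Image.new("RGB", (4, 4), color) for color in ("red", "blue")]
    buf = io.BytesIO()
    frames[0].save(buf, format="GIF", save_all=True, append_images=frames[1:])
    return buf.getvalue()


gif_bytes = make_two_frame_gif()

# A plain open-and-convert only ever touches the first frame.
with Image.open(io.BytesIO(gif_bytes)) as first:
    first_frame = first.convert("RGB")

# The file nevertheless carries more content than that single frame.
frame_total = count_frames(gif_bytes)
```

A caller could use a check like this to reject or log multi-frame inputs; whether that is the right policy depends on the deployment.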

Root Cause

Rotation: After opening an image, ImageOps.exif_transpose is not called to normalize EXIF orientation.
Transparency: Only RGBA→RGB is flattened against a background; PNGs that carry tRNS in non-RGBA modes (P, L, or RGB) take the image.convert("RGB") path, which implicitly discards or remaps transparency semantics.
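The rotation root cause can be sketched as follows. This is an illustration under stated assumptions, not the code from the merged fix: load_upright is a hypothetical helper name, and the EXIF Orientation value is set in memory rather than read from a real JPEG. ImageOps.exif_transpose is the Pillow call named above.

```python
from PIL import Image, ImageOps

ORIENTATION_TAG = 0x0112  # standard EXIF Orientation tag ID


def load_upright(image: Image.Image) -> Image.Image:
    """Hypothetical helper: return a copy with EXIF orientation applied to the pixels."""
    return ImageOps.exif_transpose(image)


# Demo: a 2x1 image tagged with orientation 6 (viewers display it rotated).
tagged = Image.new("RGB", (2, 1))
tagged.getexif()[ORIENTATION_TAG] = 6

raw_size = tagged.size                  # what a pipeline sees without normalization
upright_size = load_upright(tagged).size  # what a human viewer would see

# An image without an Orientation tag passes through unchanged.
plain_size = load_upright(Image.new("RGB", (2, 1))).size
```

Without the normalization step the stored pixel grid (2x1 here) is passed on as-is, while viewers that honor EXIF show the rotated 1x2 layout.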

Affected Code

https://github.com/vllm-project/vllm/blob/16b37f3119918c1e5a39f303e0d0892c65c07a90/vllm/multimodal/image.py#L77-L84

https://github.com/vllm-project/vllm/blob/16b37f3119918c1e5a39f303e0d0892c65c07a90/vllm/multimodal/image.py#L37-L43

https://github.com/vllm-project/vllm/blob/16b37f3119918c1e5a39f303e0d0892c65c07a90/vllm/multimodal/image.py#L26-L34

Current state: ImageOps.exif_transpose is not used. (Although the rescale_image_size function (https://github.com/vllm-project/vllm/blob/main/vllm/multimodal/image.py#L14) exists and includes a transpose parameter, I’ve found that it doesn’t seem to be called anywhere outside the test directory.)

Call order: _convert_image_mode runs first; if the conditions are met, convert_image_mode is called.

Issue: Only the “RGBA → RGB” path is explicitly flattened. P, L, or RGB with tRNS all fall back to image.convert("RGB"). For PNGs that include tRNS, convert("RGB") directly produces 24-bit RGB, leading to:

P mode: The transparent index becomes an actual RGB color (often black, white, or an undefined background), so transparency is lost.

L/LA and RGB + tRNS: convert("RGB") doesn’t composite against a chosen background first, so elements that relied on transparency to be hidden or softened become solid.
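The transparency cases above can be reproduced with a small sketch. It is a hedged illustration, not the merged fix: flatten_to_rgb is a hypothetical helper, the white background is an arbitrary choice, and the tRNS color key is set directly in image.info instead of being read from a PNG file.

```python
from PIL import Image


def flatten_to_rgb(image: Image.Image, background=(255, 255, 255)) -> Image.Image:
    """Hypothetical helper: composite any transparency onto a solid background."""
    has_alpha = image.mode in ("RGBA", "LA", "PA") or "transparency" in image.info
    if not has_alpha:
        return image.convert("RGB")
    # Converting to RGBA promotes a tRNS color key or palette alpha to a real alpha channel.
    rgba = image.convert("RGBA")
    canvas = Image.new("RGB", rgba.size, background)
    canvas.paste(rgba, mask=rgba.getchannel("A"))
    return canvas


# Demo: an RGB image whose tRNS entry marks pure red as fully transparent.
demo = Image.new("RGB", (2, 1), (255, 0, 0))
demo.putpixel((1, 0), (0, 0, 255))
demo.info["transparency"] = (255, 0, 0)

naive = demo.convert("RGB")   # the transparent red pixel stays solid red
flat = flatten_to_rgb(demo)   # the transparent pixel becomes the background color
```

In the naive result the pixel a viewer would render as see-through remains solid red; in the flattened result it takes the chosen background color, while the opaque blue pixel is unchanged in both.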

Impact & Scope

Impact: Pixels the model sees can diverge from operator expectations (due to orientation or transparency handling), potentially altering downstream reasoning.
Scope: The image I/O and mode-conversion paths in vllm/multimodal/image.py. The existing RGBA→RGB flattening is correct; the issues center on missing EXIF normalization and non-RGBA tRNS not being explicitly composited.

Case

EXIF: http://qiniu.funxingzuo.top/exif_orient_180.jpg

tRNS: http://qiniu.funxingzuo.top/hello.png

Fix

A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/44974

🎯 Affected products (1)

pip/vllm:>= 0.11.0, <= 0.23.0