Critical vLLM Vulnerability Allows Remote Code Execution via Malicious Video Files
A critical vulnerability chain has been disclosed in vLLM, a widely used high-performance library for Large Language Model inference. Tracked as CVE-2026-22778 with a CVSS score of 9.8, the flaw allows remote attackers to execute arbitrary commands on vLLM servers by submitting a malicious video URL to the API.
Default vLLM installations have no authentication, so no privileges are required to exploit this vulnerability. Even with API key authentication enabled, the exploit remains feasible through the /v1/invocations route, which allows payload execution before authentication.
Deployments not serving video models are not affected.
The Exploit Chain
The vulnerability combines two weaknesses to achieve reliable remote code execution.
Vulnerability 1: PIL Address Leak (ASLR Bypass)
When an invalid image is sent to vLLM's multimodal endpoint, PIL throws an error message that includes a heap memory address:
cannot identify image file <_io.BytesIO object at 0x7a95e299e750>
vLLM returns this error to the client, leaking an address approximately 10.33 GB before libc in memory. This reduces the ASLR search space from roughly 4 billion possible guesses to approximately 8, effectively bypassing address space layout randomization.
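The leak comes from the default Python repr of the BytesIO object, which embeds its memory address. As a minimal sketch of one possible defense (not necessarily how vLLM fixed it), error text can be scrubbed before it is returned to a client. The helper name `redact_addresses` and the placeholder string are hypothetical.

```python
import io
import re

# Matches hexadecimal pointer-style values such as 0x7a95e299e750.
_HEX_ADDRESS = re.compile(r"0x[0-9a-fA-F]+")


def redact_addresses(message: str) -> str:
    """Replace hex addresses in an error message with a fixed placeholder."""
    return _HEX_ADDRESS.sub("0x[redacted]", message)


# The default repr of a BytesIO object includes its address in CPython,
# which is how the address ends up inside the PIL error text.
leaky = f"cannot identify image file {io.BytesIO()!r}"
safe = redact_addresses(leaky)
print(safe)
```

Returning a generic error to the client and keeping the detailed message in server-side logs only would achieve the same goal without pattern matching.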
Vulnerability 2: JPEG2000 Heap Overflow (RCE)
vLLM uses OpenCV for video decoding, which bundles FFmpeg 5.1.x containing a heap overflow in the JPEG2000 decoder. The JPEG2000 format includes a cdef box that can remap color channels. By remapping the Y (luma) channel into the U (chroma) buffer, attackers trigger a heap overflow: the Y plane is four times the size of the U plane buffer it is written into.
For a 150×64 pixel image, this produces a 7,200-byte overflow past the U buffer allocation. Larger images produce proportionally larger overflows.
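The 7,200-byte figure can be reproduced with simple plane arithmetic. The sketch below assumes 4:2:0 chroma subsampling (the U plane is half the width and half the height of the Y plane) and one byte per sample, and it ignores any row padding the decoder may add; these assumptions are consistent with the numbers above but are not spelled out in the advisory.

```python
import math


def plane_overflow_bytes(width: int, height: int) -> int:
    """Bytes written past the U buffer when the full Y plane is copied into it."""
    y_size = width * height  # full-resolution luma plane
    # Chroma plane under the assumed 4:2:0 subsampling, rounded up per axis.
    u_size = math.ceil(width / 2) * math.ceil(height / 2)
    return y_size - u_size


print(plane_overflow_bytes(150, 64))  # 9600 - 2400 = 7200
```

Because the overflow is three quarters of the Y plane, it grows linearly with the pixel count of the frame.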
The overflow overwrites an AVBuffer structure containing a free() function pointer. Attackers set this pointer to system() and the accompanying opaque parameter to their command string. When the buffer is freed, the command executes on the server.
Attack Flow
The complete attack chain works as follows:
- Attacker sends request with video_url pointing to a malicious .mov file
- vLLM fetches the video from the URL
- vLLM passes video bytes to cv2.VideoCapture()
- OpenCV's bundled FFmpeg decodes JPEG2000 frames
- Malicious cdef box triggers heap overflow
- AVBuffer.free pointer overwritten with system()
- When buffer is released, system("attacker command") executes
Affected Versions and Endpoints
Affected versions: vLLM >= 0.8.3, < 0.14.1
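To check a deployment against this range, a version string can be compared as a numeric tuple. This is a minimal sketch using only the standard library; the helper names are hypothetical, and pre-release or local suffixes (for example "rc1" or "+cu121") are ignored, which is a simplification.

```python
import re

FIRST_AFFECTED = (0, 8, 3)   # lower bound, inclusive
FIRST_FIXED = (0, 14, 1)     # upper bound, exclusive


def parse_release(version: str) -> tuple[int, int, int]:
    """Parse the leading major.minor[.patch] part of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def is_affected(version: str) -> bool:
    """True if the version falls in the range >= 0.8.3 and < 0.14.1."""
    return FIRST_AFFECTED <= parse_release(version) < FIRST_FIXED


print(is_affected("0.10.1"))  # True
print(is_affected("0.14.1"))  # False
```

The installed version can be read with `importlib.metadata.version("vllm")` and passed to `is_affected`.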
Vulnerable endpoints:
- POST /v1/chat/completions (with video_url in content)
- POST /v1/invocations (with video_url in content)
Affected components:
- OpenCV 4.x with bundled FFmpeg
- FFmpeg 5.1.x (bundled in OpenCV)
- libopenjp2 2.x
Remediation
Upgrade to vLLM version 0.14.1 or later immediately. Organizations running vLLM with video model support in production should treat this as an emergency patch.
If immediate patching is not possible, disable video model endpoints or implement network-level restrictions to limit API access to trusted sources. These mitigations are incomplete, however; upgrading is the only complete fix.
Organizations should also audit logs for unusual video URL submissions to multimodal endpoints, particularly URLs pointing to external or untrusted sources.
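As a starting point for such an audit, logged request bodies can be scanned for video URLs that point outside a set of trusted hosts. The sketch below assumes request bodies are available as JSON strings and that video parts follow the OpenAI-style shape {"type": "video_url", "video_url": {"url": ...}}. The allowlist contents and function names are hypothetical and would need adapting to the actual log format.

```python
import json
from urllib.parse import urlparse

# Hypothetical allowlist of hosts that are expected to serve video.
TRUSTED_HOSTS = {"media.internal.example"}


def extract_video_urls(request_body: str) -> list[str]:
    """Collect every video_url value found in a chat-style request body."""
    payload = json.loads(request_body)
    urls: list[str] = []
    if not isinstance(payload, dict):
        return urls
    for message in payload.get("messages", []):
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-string content carries no media parts
        for part in content:
            if isinstance(part, dict) and part.get("type") == "video_url":
                video = part.get("video_url")
                if isinstance(video, dict) and isinstance(video.get("url"), str):
                    urls.append(video["url"])
    return urls


def untrusted_video_urls(request_body: str, trusted_hosts=TRUSTED_HOSTS) -> list[str]:
    """Return video URLs whose host is not on the allowlist."""
    return [
        url
        for url in extract_video_urls(request_body)
        if urlparse(url).hostname not in trusted_hosts
    ]
```

URLs without a hostname, such as data: URLs, are also reported by this check, since their host is never on the allowlist.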