CVE-2025-49847

Name	CVE-2025-49847
Description	llama.cpp is an inference of several LLM models in C/C++. Prior to version b5662, an attacker‐supplied GGUF model vocabulary can trigger a buffer overflow in llama.cpp’s vocabulary‐loading code. Specifically, the helper _try_copy in llama.cpp/src/vocab.cpp: llama_vocab::impl::token_to_piece() casts a very large size_t token length into an int32_t, causing the length check (if (length < (int32_t)size)) to be bypassed. As a result, memcpy is still called with that oversized size, letting a malicious model overwrite memory beyond the intended buffer. This can lead to arbitrary memory corruption and potential code execution. This issue has been patched in version b5662.
Source	CVE (at NVD; CERT, ENISA, LWN, oss-sec, fulldisc, Debian ELTS, Red Hat, Ubuntu, Gentoo, SUSE bugzilla/CVE, GitHub advisories/code/issues, web search, more)
Debian Bugs	1108113

Vulnerable and fixed packages

The table below lists information on source packages.

Source Package	Release	Version	Status
llama.cpp (PTS)	forky, sid	9951+dfsg-3	fixed

The information below is based on the following data on fixed versions.

Package	Type	Release	Fixed Version	Urgency	Origin	Debian Bugs
llama.cpp	source	(unstable)	5713+dfsg-1			1108113

Notes

https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-8wwf-w4qm-gpqr
https://github.com/ggml-org/llama.cpp/commit/3cfbbdb44e08fd19429fed6cc85b982a91f0efd5 (b5662)