Cybersecurity researchers have disclosed a critical out-of-bounds read vulnerability in Ollama that could allow a remote, unauthenticated attacker to leak the entire memory of the Ollama process, potentially exposing environment variables, API keys, system prompts, and user data. The flaw is tracked as CVE-2026-7482 (CVSS score: 9.1).
The flaw stems from Ollama’s use of Go's unsafe package in the WriteTo() function when creating a model from a GGUF file. The /api/create endpoint accepts a crafted GGUF file whose declared tensor offset and size extend beyond the actual file length, so exploitation could cause the server to read past the end of its heap buffer during quantization.
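The missing check described above can be illustrated with a minimal sketch. It is not Ollama's code or its actual patch. The TensorInfo type, its field names, and both helper functions are hypothetical, and the GGUF layout is reduced to "a data section starting at data_start". The idea is only that each tensor's declared byte range is compared against the real file length before any data is read.

```python
from dataclasses import dataclass


@dataclass
class TensorInfo:
    """Hypothetical, simplified view of one tensor entry in a GGUF header."""
    name: str
    offset: int  # declared byte offset, relative to the start of the data section
    size: int    # declared byte length of the tensor data


def tensor_in_bounds(t: TensorInfo, data_start: int, file_len: int) -> bool:
    """Return True only if the tensor's declared byte range lies inside the file."""
    if t.offset < 0 or t.size < 0 or data_start < 0:
        return False
    end = data_start + t.offset + t.size
    return end <= file_len


def validate_tensors(tensors, data_start: int, file_len: int) -> None:
    """Raise ValueError for the first tensor whose declared range exceeds the file."""
    for t in tensors:
        if not tensor_in_bounds(t, data_start, file_len):
            raise ValueError(f"tensor {t.name!r} exceeds file bounds")
```

Python integers do not overflow, so the addition here is safe as written. In a language with fixed-width integers, such as Go, the same check would also need to guard against the sum wrapping around.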
The vulnerability, codenamed Bleeding Llama by Cyera, is believed to affect over 300,000 servers globally. Ollama versions before 0.17.1 are cited as containing the heap out-of-bounds read in the GGUF model loader. The exploit chain has three steps: uploading a crafted GGUF file, triggering the out-of-bounds read via /api/create, and exfiltrating the leaked data through /api/push to an attacker-controlled registry. The risk is heightened if Ollama is connected to tools like Claude Code.
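For administrators triaging an inventory, the version cutoff cited above (releases before 0.17.1) can be checked with a short sketch. The function names are hypothetical, and the parsing assumes plain dotted version strings with an optional leading "v". Pre-release suffixes after a hyphen are ignored, which is a simplification and not a statement about how Ollama labels its builds.

```python
FIXED_VERSION = (0, 17, 1)  # first release cited as not affected


def parse_version(version: str) -> tuple:
    """Turn '0.17.1' or 'v0.17.1' into (0, 17, 1); drops anything after '-'."""
    core = version.strip().lstrip("v").split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def is_affected(version: str) -> bool:
    """Return True if the version sorts numerically before the fixed release."""
    return parse_version(version) < FIXED_VERSION
```

Comparing integer tuples avoids the classic string-comparison mistake, where "0.9.6" would sort after "0.17.1".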
Users are urged to apply the fixes, restrict network access, and consider adding authentication in front of Ollama instances. The flaw description published on CVE[.]org corroborates these details.
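One way to audit the "restrict network access" advice is to check whether a configured listen address is loopback-only. The sketch below is an illustration and not an official tool. It assumes the address comes from a setting such as Ollama's OLLAMA_HOST environment variable, in forms like "host", "host:port", or "scheme://host:port". The function name is hypothetical, and anything it cannot positively identify as loopback is treated as exposed.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_only(host_value: str) -> bool:
    """Return True only if the address clearly refers to a loopback interface."""
    value = host_value.strip()
    if "://" not in value:
        value = "//" + value  # lets urlsplit treat the text as a network location
    try:
        host = urlsplit(value).hostname
    except ValueError:
        return False  # malformed address: fail closed
    if not host:
        return False  # no host given (e.g. ":11434"): assume all interfaces
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # some other hostname: cannot confirm it is local
```

A value of "0.0.0.0" or a bare ":11434" returns False here, flagging the instance for a firewall rule or an authenticating reverse proxy.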