Critical Ollama heap out-of-bounds read vulnerability (CVE-2026-7482)

Bleeding Llama bug leaks secrets from 300k Ollama servers

A critical flaw in the Ollama AI server allows unauthenticated attackers to remotely read the server's process memory, potentially exposing API keys, environment variables and user prompts. The issue, tracked as CVE-2026-7482, carries a CVSS score of 9.1 and affects all versions before 0.17.1.

The vulnerability is an out-of-bounds read in the GGUF model loader. When the /api/create endpoint receives a crafted GGUF file whose declared tensor offset and size extend past the end of the actual file, the WriteTo() function in fs/ggml/gguf.go and server/quantization.go reads beyond the allocated heap buffer during quantization. The over-read can expose any data residing in the Ollama process memory.
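The class of check that prevents this kind of over-read can be sketched briefly. The Python below is illustrative only and is not Ollama's code or its actual patch: the TensorInfo type, the data_start parameter and both function names are hypothetical, and the layout is a simplified stand-in for a model file whose header declares where each tensor's bytes live.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorInfo:
    """Hypothetical, simplified view of one tensor entry declared in a file header."""
    name: str
    offset: int  # byte offset of the tensor data, relative to data_start
    size: int    # number of bytes the header claims the tensor occupies


def tensor_in_bounds(tensor: TensorInfo, data_start: int, file_size: int) -> bool:
    """Return True only if the declared byte range lies entirely inside the file."""
    if tensor.offset < 0 or tensor.size < 0 or data_start < 0:
        return False
    end = data_start + tensor.offset + tensor.size
    return end <= file_size


def validate_tensors(tensors, data_start: int, file_size: int) -> None:
    """Raise ValueError for the first tensor whose declared range exceeds the file."""
    for t in tensors:
        if not tensor_in_bounds(t, data_start, file_size):
            raise ValueError(f"tensor {t.name!r} declares bytes beyond end of file")
```

The point of the sketch is that header fields are attacker-controlled input and should be compared with the real file length before any read is sized from them. Python integers do not overflow; an equivalent check in a language with fixed-width integers would also need to guard the addition itself.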

Leaked contents may include API keys, tokens, system prompts and concurrent chat histories. Attackers can exfiltrate the stolen data by pushing the resulting model through the /api/push endpoint to a server they control, a step that also requires no authentication. Ollama ships without a built-in authentication mechanism and listens on all interfaces when the OLLAMA_HOST variable is set to 0.0.0.0, a configuration common in public deployments.
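Administrators who want to audit their own configuration could start with a check along the following lines. It is a rough sketch under stated assumptions: the function names are hypothetical, the accepted value shapes (host, host:port, scheme://host:port) are an assumption about how OLLAMA_HOST is commonly written, and an unset or empty value is assumed to mean a loopback-only default.

```python
import ipaddress
from urllib.parse import urlsplit


def bind_host(ollama_host: str) -> str:
    """Extract the host part from values like '0.0.0.0', 'host:port' or 'http://host:port'."""
    value = ollama_host.strip()
    if "://" not in value:
        value = "//" + value  # lets urlsplit treat the value as a network location
    return urlsplit(value).hostname or ""


def listens_beyond_loopback(ollama_host: str) -> bool:
    """Return True if the configured bind address is reachable from other machines."""
    host = bind_host(ollama_host)
    if host in ("", "localhost"):
        return False  # assumption: an empty value falls back to a loopback default
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return True  # an arbitrary hostname: treat it as potentially reachable
    return not addr.is_loopback
```

Calling listens_beyond_loopback(os.environ.get("OLLAMA_HOST", "")) on each host (after importing os) gives a quick first pass. It says nothing about firewalls, container port mappings or proxies in front of the service, so a True result is a prompt for closer review rather than proof of exposure.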

Cyera estimates that more than 300,000 Ollama instances are exposed to the issue, although no exploitation in the wild has been observed so far. The flaw highlights the danger of exposing unauthenticated AI service APIs to the internet, a pattern seen in several recent container-related security events.

Defenders should upgrade Ollama to version 0.17.1 or newer as soon as possible. If immediate patching is not feasible, network restrictions such as blocking external access to the /api/create and /api/push paths or binding the service to localhost only can reduce risk. Enabling reverse‑proxy authentication and logging unusual model uploads also helps detect attempted exploitation.
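The path-based restriction described above amounts to a small decision rule that a reverse proxy or gateway would apply to each request. The following Python sketch shows that rule in isolation; the function names are hypothetical, the policy of treating every non-loopback client as external is an assumption, and the path handling is deliberately minimal (it does not decode percent-encoding or resolve ".." segments, which a production filter would need to consider).

```python
import ipaddress

# Endpoints named in the advisory as the upload and exfiltration paths.
BLOCKED_PATHS = {"/api/create", "/api/push"}


def normalized_path(raw_target: str) -> str:
    """Drop the query string and collapse empty or '.' segments in a request target."""
    path = raw_target.split("?", 1)[0]
    segments = [s for s in path.split("/") if s and s != "."]
    return "/" + "/".join(segments)


def is_external(client_ip: str) -> bool:
    """Assumed policy: any client that is not on the loopback interface is external."""
    return not ipaddress.ip_address(client_ip).is_loopback


def allow_request(raw_target: str, client_ip: str) -> bool:
    """Reject requests to the sensitive endpoints unless they come from loopback."""
    if normalized_path(raw_target) in BLOCKED_PATHS and is_external(client_ip):
        return False
    return True
```

Normalizing before matching matters because a filter that compares raw strings can be sidestepped with variants such as a doubled slash or a trailing slash. Filtering of this kind is a stopgap that narrows the attack surface; it does not remove the underlying bug.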

Maintaining an inventory of all Ollama deployments, applying patches promptly and monitoring for anomalous heap-access behaviours are prudent steps. Subscribing to vendor advisories and keeping the software supply chain under review will limit exposure to similar flaws in the future.
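For teams that already keep such an inventory, flagging unpatched hosts is a simple comparison against the fixed release. The sketch below assumes a hypothetical inventory shaped as a mapping from host name to reported version string; how those versions are collected is left open, and the parser ignores pre-release or build suffixes, which is a simplification.

```python
FIXED_VERSION = (0, 17, 1)  # first release that contains the fix, per the advisory


def parse_version(text: str) -> tuple:
    """Turn '0.17.0' or 'v0.17.0-rc1' into a tuple of ints, ignoring any suffix."""
    core = text.strip().lstrip("v").split("-", 1)[0].split("+", 1)[0]
    return tuple(int(part) for part in core.split("."))


def needs_upgrade(version: str) -> bool:
    """Return True if the version is older than the fixed release."""
    return parse_version(version) < FIXED_VERSION


def unpatched(inventory: dict) -> list:
    """Return the sorted host names whose reported version predates the fix."""
    return sorted(host for host, ver in inventory.items() if needs_upgrade(ver))
```

Comparing integer tuples avoids the classic mistake of comparing version strings lexically, where "0.9.12" would wrongly sort after "0.17.1".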