fix: preserve Flax routing across ambiguous prefixes #1379

Open
mldangelo-oai wants to merge 2 commits into main from mdangelo/codex/fix-flax-post-merge-review

@mldangelo-oai
Contributor

Summary

  • follow up on the merged PR "fix: route large and renamed Flax MessagePack checkpoints" (#1280) by closing the remaining renamed Flax routing gaps identified during final review
  • route structurally confirmed Flax checkpoints under skipped document suffixes and behind pickle-shaped prefixes while retaining supplementary pickle findings
  • replace unbounded JSON trailing-whitespace exclusion with bounded, fail-closed ambiguity handling and preserve XML/PMML and safetensors ownership
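
The bounded, fail-closed idea in the last bullet can be sketched roughly as follows. This is not the PR's code: `classify_json_like`, `JsonVerdict`, and the 4096-byte bound are hypothetical names and values chosen for illustration. The sketch treats a JSON value followed by a small amount of whitespace as JSON, and anything it cannot settle within the bound as ambiguous, so that other scanners still get to look at the file.

```python
import json
from enum import Enum

# Hypothetical bound; the PR does not state the real limit.
MAX_TRAILING_WHITESPACE = 4096
JSON_WHITESPACE = b" \t\r\n"


class JsonVerdict(Enum):
    JSON = "json"            # confidently a plain JSON document
    NOT_JSON = "not_json"    # the buffer does not start with a JSON value
    AMBIGUOUS = "ambiguous"  # fail closed: keep the file in scope


def classify_json_like(data: bytes) -> JsonVerdict:
    """Classify a buffer that may be a JSON value followed by trailing bytes."""
    # latin-1 maps each byte to one character, so str offsets equal byte offsets.
    text = data.decode("latin-1")
    start = len(text) - len(text.lstrip(" \t\r\n"))
    try:
        _, end = json.JSONDecoder().raw_decode(text, start)
    except json.JSONDecodeError:
        return JsonVerdict.NOT_JSON
    except RecursionError:
        # Pathologically nested input: do not guess.
        return JsonVerdict.AMBIGUOUS

    trailing = data[end:]
    if trailing.strip(JSON_WHITESPACE):
        # Non-whitespace bytes follow the JSON value; something else may be there.
        return JsonVerdict.AMBIGUOUS
    if len(trailing) > MAX_TRAILING_WHITESPACE:
        # More padding than the bound allows; refuse to call it plain JSON.
        return JsonVerdict.AMBIGUOUS
    return JsonVerdict.JSON
```

A buffer such as `b'{}'` followed by MessagePack bytes lands in `AMBIGUOUS` here rather than being excluded as JSON, which is the fail-closed direction the bullet describes.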

Review Context

This addresses the unresolved review findings raised after #1280 had already been merged:

  • large JSON-prefix / bounded trailing-whitespace handling
  • protocol-0 and binary pickle-prefix Flax collisions
  • skipped document-suffix Flax checkpoints
  • checkpoint suffix ownership, already confirmed fixed in merged main

Two independent review passes validated the actionable paths. One rejected an initial simplification that would have silently skipped a JSON-looking MessagePack prelude with a later malicious Flax object; this PR retains that case as incomplete coverage instead.

Validation

  • uv run ruff format modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check --fix modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1 (6352 passed, 16 skipped)
  • focused routing/scanner/core suite (663 passed)
  • live CLI probes for document-suffix Flax, pickle-prefix overlap, PMML precedence, safetensors precedence, and JSON-looking fail-closed ambiguity

github-actions Bot commented May 27, 2026

Workflow run and artifacts

Performance Benchmarks

Compared 12 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 12 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 684.66ms -> 684.82ms (+0.0%).
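
As a rough illustration of how a percentage change and a threshold-based status could be derived from two medians, here is a minimal sketch. The function names are hypothetical, and the symmetric use of the 15% threshold for "improved" is an assumption; the benchmark workflow's actual rules are not shown in this comment.

```python
def percent_change(baseline_ms: float, current_ms: float) -> float:
    """Relative change of current versus baseline, in percent."""
    return (current_ms - baseline_ms) / baseline_ms * 100.0


def classify(baseline_ms: float, current_ms: float, threshold_pct: float = 15.0) -> str:
    """Label a benchmark; the symmetric threshold is an assumption."""
    change = percent_change(baseline_ms, current_ms)
    if change > threshold_pct:
        return "regression"
    if change < -threshold_pct:
        return "improved"
    return "stable"


# Aggregate medians quoted in the comment above.
aggregate_change = percent_change(684.66, 684.82)
aggregate_status = classify(684.66, 684.82)
```

Per-row percentages recomputed from the rounded table values may differ in the last digit from the reported ones, since the report presumably works from unrounded timings.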

| Workload | Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| chunked-upload-stream | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_upload_stream | chunked_stream | 278.2 KiB | 1 | 17.85ms | 18.77ms | +5.1% | stable |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_hex] | nested_hex | 130 B | 1 | 418.5us | 436.5us | +4.3% | stable |
| clean-training-checkpoint | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_clean_training_checkpoint | safe_large | 278.2 KiB | 1 | 14.76ms | 15.38ms | +4.2% | stable |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64] | nested_base64 | 98 B | 1 | 419.7us | 409.6us | -2.4% | stable |
| padded-multi-stream-upload | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_padded_multi_stream_upload | multi_stream_padded | 4.1 KiB | 1 | 1.55ms | 1.51ms | -2.0% | stable |
| direct-malicious-upload | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload | malicious_reduce | 52 B | 1 | 1.48ms | 1.46ms | -1.4% | stable |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw] | nested_raw | 78 B | 1 | 414.1us | 409.2us | -1.2% | stable |
| suspicious-pickle-intake | tests/benchmarks/test_scan_benchmarks.py::test_scan_suspicious_pickle_intake | suspicious-intake | 183.8 KiB | 4 | 89.32ms | 88.77ms | -0.6% | stable |
| warm-cache-rescan | tests/benchmarks/test_scan_benchmarks.py::test_scan_warm_cached_repository_rescan | release-candidate | 547.3 KiB | 32 | 51.43ms | 51.72ms | +0.6% | stable |
| duplicate-heavy-registry | tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_registry_snapshot | registry-snapshot | 915.2 KiB | 13 | 197.86ms | 197.22ms | -0.3% | stable |
| single-checkpoint-preflight | tests/benchmarks/test_scan_benchmarks.py::test_scan_single_checkpoint_before_load | single_checkpoint.pkl | 183.0 KiB | 1 | 37.46ms | 37.54ms | +0.2% | stable |
| mixed-model-repository | tests/benchmarks/test_scan_benchmarks.py::test_scan_release_candidate_repository | release-candidate | 547.3 KiB | 32 | 271.69ms | 271.20ms | -0.2% | stable |

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ac7b8e7132

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread modelaudit/utils/file/detection.py
Comment thread modelaudit/utils/file/detection.py
Comment thread modelaudit/utils/file/detection.py Outdated
…ax-post-merge-review

# Conflicts:
#	modelaudit/utils/file/detection.py
#	tests/utils/file/test_filetype.py
chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b8d54aa3fe

Comment on lines +2395 to +2399
    if _looks_like_binary_pickle_protocol(prefix[:4]) or _looks_like_proto0_or_1_pickle(
        pickle_probe_sample,
        sample_is_prefix=size > len(pickle_probe_sample),
    ):
        routes.append("pickle")

P2: Validate pickle-shaped overlaps before exposing them

Fresh evidence: the new parse-only suppression runs only inside merge_flax_msgpack_overlap_findings, but this route is also used earlier as a trusted fallback when Flax is disabled or the user selects only pickle (see the scanner-selection fallback in core.py). In that context, a benign Flax MessagePack stream beginning with the ordinary scalars 0x80 0x04 is promoted to the pickle scanner as the primary scanner and can still surface pickle parse failures instead of a clean Flax skip. Either validate the binary pickle stream before appending this overlap route, or keep unvalidated near-matches out of the fallback route list.
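
The collision is easy to reproduce: in MessagePack, 0x80 is an empty fixmap and 0x04 is the positive fixint 4, while in pickle the same two bytes are the PROTO opcode with argument 4. One way to approach the "validate before appending" suggestion is an opcode-level walk with the standard library's `pickletools.genops`, which tokenizes a stream without executing it. The helper name `parses_as_pickle` and the opcode budget are hypothetical; this is a sketch, not the check modelaudit uses.

```python
import pickle
import pickletools

# Hypothetical cap on how many opcodes the check is willing to walk.
MAX_OPCODES = 10_000


def parses_as_pickle(sample: bytes, max_opcodes: int = MAX_OPCODES) -> bool:
    """Return True only if `sample` tokenizes cleanly up to a STOP opcode.

    pickletools.genops decodes opcodes and arguments; it never runs them.
    """
    try:
        for count, (opcode, _arg, _pos) in enumerate(pickletools.genops(sample), start=1):
            if opcode.name == "STOP":
                return True
            if count >= max_opcodes:
                return False  # could not confirm within the budget
    except Exception:
        # Unknown opcode, truncated argument, or data exhausted before STOP.
        return False
    return False


real_pickle = pickle.dumps({"a": 1}, protocol=4)
# MessagePack-shaped bytes that merely start with 0x80 0x04.
lookalike = b"\x80\x04\x81\xa6params\x80"
```

Here `parses_as_pickle(real_pickle)` succeeds, while `lookalike` fails on the byte 0xa6, which is not a pickle opcode. A sample that is only a prefix of a larger file cannot reach STOP, so a real check would need a separate rule for that case.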

Comment on lines +2375 to +2377
    return probe_state is True or (
        probe_state is None and ext not in _FLAX_MSGPACK_CONTENT_ROUTE_ALLOWED_DECLARED_SUFFIXES
    )

P1: Fail closed for inconclusive document-suffix Flax probes

For .txt/.md/.rst/.markdown payloads, this now drops any Flax candidate whose bounded MessagePack probe is inconclusive. A disguised checkpoint can put more than the inline-scalar budget (or another oversized valid MessagePack object) before the later params map; _probe_flax_msgpack_checkpoint_file() then returns None, detect_file_format_for_skip_filter() returns unknown, and default directory scans skip the text file entirely. Since these suffixes were explicitly added to preserve malicious renamed Flax checkpoints, inconclusive probes under those suffixes must be preserved (fail closed) rather than treated as ordinary documents.
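
The difference between the quoted condition and a fail-closed variant can be shown with a tri-state probe result (True, False, or None for inconclusive). In this sketch `DOCUMENT_SUFFIXES` is a hypothetical stand-in for `_FLAX_MSGPACK_CONTENT_ROUTE_ALLOWED_DECLARED_SUFFIXES`, assumed to hold the four suffixes named in the comment, and both function names are invented for illustration.

```python
from typing import Optional

# Hypothetical stand-in for the suffix set referenced in the diff.
DOCUMENT_SUFFIXES = frozenset({".txt", ".md", ".rst", ".markdown"})


def route_as_quoted(probe_state: Optional[bool], ext: str) -> bool:
    """Mirror of the quoted condition: inconclusive probes are dropped
    when the file carries a document suffix."""
    return probe_state is True or (
        probe_state is None and ext not in DOCUMENT_SUFFIXES
    )


def route_fail_closed(probe_state: Optional[bool], ext: str) -> bool:
    """Keep every candidate the probe could not rule out.

    `ext` is kept only for signature parity with route_as_quoted.
    """
    return probe_state is not False


# An inconclusive probe on a .txt file is the case the comment is about.
quoted_result = route_as_quoted(None, ".txt")
fail_closed_result = route_fail_closed(None, ".txt")
```

Under the quoted condition the inconclusive `.txt` candidate is dropped, while the fail-closed variant keeps it. The trade-off is that fail-closed routing would send more ordinary text files to the Flax scanner whenever the bounded probe runs out of budget.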
