fix: preserve Flax routing across ambiguous prefixes#1379
fix: preserve Flax routing across ambiguous prefixes#1379mldangelo-oai wants to merge 2 commits into
Conversation
Performance BenchmarksCompared
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ac7b8e7132
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…ax-post-merge-review # Conflicts: # modelaudit/utils/file/detection.py # tests/utils/file/test_filetype.py
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b8d54aa3fe
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if _looks_like_binary_pickle_protocol(prefix[:4]) or _looks_like_proto0_or_1_pickle( | ||
| pickle_probe_sample, | ||
| sample_is_prefix=size > len(pickle_probe_sample), | ||
| ): | ||
| routes.append("pickle") |
There was a problem hiding this comment.
Validate pickle-shaped overlaps before exposing them
Fresh evidence is that the new parse-only suppression only runs inside merge_flax_msgpack_overlap_findings, but this route is also used earlier as a trusted fallback when Flax is disabled or the user selects only pickle (see core.py scanner-selection fallback). In that context, a benign Flax MessagePack stream beginning with ordinary scalars 0x80 0x04 is promoted to the pickle scanner as the primary scanner and can still surface pickle parse failures instead of a clean Flax skip; validate the binary pickle stream before appending this overlap route, or keep unvalidated near-matches out of the fallback route list.
Useful? React with 👍 / 👎.
| return probe_state is True or ( | ||
| probe_state is None and ext not in _FLAX_MSGPACK_CONTENT_ROUTE_ALLOWED_DECLARED_SUFFIXES | ||
| ) |
There was a problem hiding this comment.
Fail closed for inconclusive document-suffix Flax probes
For .txt/.md/.rst/.markdown payloads, this now drops any Flax candidate whose bounded MessagePack probe is inconclusive. A disguised checkpoint can put more than the inline-scalar budget (or another oversized valid MessagePack object) before the later params map; _probe_flax_msgpack_checkpoint_file() then returns None, detect_file_format_for_skip_filter() returns unknown, and default directory scans skip the text file entirely. Since these suffixes were explicitly added to preserve malicious renamed Flax checkpoints, inconclusive probes under those suffixes need to be preserved/fail-closed rather than treated as ordinary documents.
Useful? React with 👍 / 👎.
Summary
Review Context
This addresses the unresolved review findings raised after #1280 had already been merged:
mainTwo independent review passes validated the actionable paths. One rejected an initial simplification that would have silently skipped a JSON-looking MessagePack prelude with a later malicious Flax object; this PR retains that case as incomplete coverage instead.
Validation
uv run ruff format modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/uv run ruff check --fix modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1(6352 passed, 16 skipped)663 passed)