test: stabilize memory growth benchmark#1367
Open
mldangelo-oai wants to merge 75 commits into
…l-routing' into mdangelo/codex/fix-renamed-coreml-routing # Conflicts: # modelaudit/utils/file/detection.py
…tagraph-routing' into mdangelo/codex/review-pr1287
…entative-protobuf-routing # Conflicts: # tests/utils/file/test_filetype.py
…tobuf-routing' into mdangelo/codex/fix-renamed-coreml-routing
ianw-oai
reviewed
May 25, 2026
Contributor
ianw-oai
left a comment
Leaving this unapproved because this branch still measures cached scans; the warm-up and measured calls omit cache_enabled=False, so repeat-scan memory growth can be hidden.
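The concern above can be illustrated with a minimal sketch. Only the `cache_enabled` keyword comes from the review comment; `scan_file`, `retained_after`, the module-level cache, and the simulated retention are hypothetical stand-ins, not modelaudit's actual API or behavior.

```python
from typing import Dict, List

_result_cache: Dict[str, str] = {}
_retained: List[bytearray] = []  # stands in for memory a real scan might retain


def scan_file(path: str, cache_enabled: bool = True) -> str:
    """Hypothetical scanner: a cache hit returns before any scan work runs."""
    if cache_enabled and path in _result_cache:
        return _result_cache[path]
    _retained.append(bytearray(1024))  # simulated per-scan retention
    result = f"scanned:{path}"
    if cache_enabled:
        _result_cache[path] = result
    return result


def retained_after(repeats: int, cache_enabled: bool) -> int:
    """Count simulated retained buffers after `repeats` scans of one path."""
    _result_cache.clear()
    _retained.clear()
    for _ in range(repeats):
        scan_file("model.bin", cache_enabled=cache_enabled)
    return len(_retained)
```

In this toy model, `retained_after(5, cache_enabled=True)` returns 1 while `retained_after(5, cache_enabled=False)` returns 5: with the cache on, only the first call exercises the scan path, so a benchmark built on repeat calls would not observe per-scan growth.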
ianw-oai
approved these changes
May 25, 2026
Contributor
Performance Benchmarks
Compared
d1d573b to aa1e393
Summary
test_memory_usage_stability
Why
test_memory_usage_stability currently takes its baseline before the first scan initializes lazy imports and process-scoped analysis caches. In the dependency-complete local environment it fails on the unchanged stack parent (140.43MB RSS growth versus <50MB) and failed during validation of #1366 as well (74.23MB to 130.46MB), even though #1366's hosted benchmark lane is green and its changed signatures do not occur in tests/assets.
The test is intended to detect retained growth from repeated scanning, not cold-start initialization. A single warm-up scan followed by four measured repeats preserves five total scans while measuring that steady-state contract.
Validation
- env -u UV_EXCLUDE_NEWER -u VIRTUAL_ENV PROMPTFOO_DISABLE_TELEMETRY=1 uv --no-config run --locked pytest tests/test_performance_benchmarks.py::TestPerformanceBenchmarks::test_memory_usage_stability -q --maxfail=1
  Before this change: failed on #1366 (fix: detect asyncio subprocess launches in embedded Python) at 130.46MB; the unchanged parent #1365 (fix: resolve jit subprocess launch calls) failed in a separate worktree at 140.43MB.
  With this change: 1 passed in 74.75s.
- env -u UV_EXCLUDE_NEWER -u VIRTUAL_ENV uv --no-config run --locked ruff format modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  (398 files left unchanged)
- env -u UV_EXCLUDE_NEWER -u VIRTUAL_ENV uv --no-config run --locked ruff check --fix modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  (All checks passed!)
- env -u UV_EXCLUDE_NEWER -u VIRTUAL_ENV uv --no-config run --locked mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  (Success: no issues found in 453 source files)
- env -u UV_EXCLUDE_NEWER -u VIRTUAL_ENV PROMPTFOO_DISABLE_TELEMETRY=1 uv --no-config run --locked pytest -n auto -m "not slow and not integration" --maxfail=1
  (6490 passed, 15 skipped)
Stack
fix: detect asyncio subprocess launches), which has completed hosted CI successfully.