Skip to content

experiment with texture streaming off#5686

Draft
huidongc wants to merge 8 commits into
isaac-sim:developfrom
huidongc:deterministic-rtx-rendering
Draft

experiment with texture streaming off#5686
huidongc wants to merge 8 commits into
isaac-sim:developfrom
huidongc:deterministic-rtx-rendering

Conversation

@huidongc
Copy link
Copy Markdown
Collaborator

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.
List any dependencies that are required for this change.

Fixes # (issue)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (existing functionality will not work without user modification)
  • Documentation update

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@huidongc huidongc marked this pull request as draft May 19, 2026 05:22
@github-actions github-actions Bot added isaac-sim Related to Isaac Sim team isaac-lab Related to Isaac Lab team labels May 19, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 19, 2026

Greptile Summary

This experimental PR disables RTX texture streaming in the headless app configuration and removes the pytest-rerunfailures flaky-retry mark from all RTX rendering tests, with the hypothesis that texture streaming was a source of render non-determinism.

  • apps/isaaclab.python.headless.kit: enableTextureStreaming flipped from truefalse; this is a global setting change that applies to every headless Isaac Lab process, not just CI tests.
  • rendering_test_utils.py: _FLAKY_MARK replaced with [] (original pytest.mark.flaky left as a comment), removing the three-retry safety net that was explicitly added for NVBUG#6152566.

Confidence Score: 3/5

Not ready to merge — the flaky retry guard was removed before confirming that texture streaming = false fully eliminates the underlying render non-determinism.

The original _FLAKY_MARK comment explicitly states RTX outputs are non-deterministic on every CI run (NVBUG#6152566). Removing the three-retry guard while that bug is still open means the first non-deterministic CI execution becomes an outright failure. The kit-file change is global and undocumented, risking silent impact on non-test headless workloads.

Both changed files need attention: rendering_test_utils.py for the removal of the retry guard, and apps/isaaclab.python.headless.kit for the undocumented global scope of the texture streaming change.

Important Files Changed

Filename Overview
apps/isaaclab.python.headless.kit Disables RTX texture streaming globally for all headless runs; change is intentional per PR title but lacks a comment explaining scope/intent and could affect non-test workloads.
source/isaaclab_tasks/test/rendering_test_utils.py Removes the flaky-retry mark from all RTX rendering tests by setting _FLAKY_MARK to an empty list (with the original value commented out); CI reliability regresses if texture streaming = false does not fully fix the underlying non-determinism (NVBUG#6152566).

Reviews (1): Last reviewed commit: "experiment with texture streaming off" | Re-trigger Greptile

# Low-resolution camera outputs from RTX renderers are not deterministic enough to pass golden image testing
# on every CI run. (NVBUG#6152566)
_FLAKY_MARK = pytest.mark.flaky(max_runs=3, min_passes=1)
_FLAKY_MARK = [] # pytest.mark.flaky(max_runs=3, min_passes=1)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Removing flaky retry without evidence of determinism fix

The comment directly above this line explains why the mark existed: RTX renderer outputs are "not deterministic enough to pass golden image testing on every CI run" (NVBUG#6152566). Replacing the mark with an empty list means any non-deterministic render will cause an outright CI failure instead of a retry. If disabling texture streaming does not fully resolve the NVBUG, the test suite will regress in CI reliability. Before landing this, it would be worth confirming that the texture-streaming change actually eliminates the non-determinism that drove the original max_runs=3 guard.

# Low-resolution camera outputs from RTX renderers are not deterministic enough to pass golden image testing
# on every CI run. (NVBUG#6152566)
_FLAKY_MARK = pytest.mark.flaky(max_runs=3, min_passes=1)
_FLAKY_MARK = [] # pytest.mark.flaky(max_runs=3, min_passes=1)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Commented-out code should be removed rather than left as a dormant inline comment, especially for a module-level constant that is read in every pytest.param call. If the flaky mark is being intentionally retired, delete the old value; if this is experimental, track it in an issue/TODO instead.

Suggested change
_FLAKY_MARK = [] # pytest.mark.flaky(max_runs=3, min_passes=1)
_FLAKY_MARK = []

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

renderer.multiGpu.enabled = true
renderer.multiGpu.autoEnable = true
'rtx-transient'.resourcemanager.enableTextureStreaming = true
'rtx-transient'.resourcemanager.enableTextureStreaming = false
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Global setting change affects all headless users, not just tests

isaaclab.python.headless.kit is the shared app configuration loaded for every headless Isaac Lab run (training scripts, data-collection pipelines, etc.), not solely by CI rendering tests. Disabling texture streaming may change startup time, VRAM usage, and rendering fidelity for all downstream users. There is no comment here (or in the PR description) explaining whether this is intended to be permanent or is scoped to the test experiment.

Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

This PR experiments with disabling texture streaming to improve RTX rendering determinism (NVBUG#6152566). The approach is reasonable as a validation experiment.

Key Observations

Texture Streaming Change (apps/isaaclab.python.headless.kit):

  • Disabling enableTextureStreaming is a valid approach to investigate rendering determinism issues, as texture streaming can introduce non-deterministic memory allocation patterns.
  • Consider whether other .kit files (e.g., isaaclab.python.headless.rendering.kit) should have consistent settings if this experiment proves successful.

Flaky Mark Removal (source/isaaclab_tasks/test/rendering_test_utils.py):

  • Setting _FLAKY_MARK = [] is technically correct — pytest accepts an empty iterable for marks=.
  • This change is appropriate for an experiment: if rendering is now deterministic, tests should pass consistently; if not, failures will indicate the fix is incomplete.
  • The commented-out original code is helpful for reverting if needed.

Suggestions

  1. Experiment validation: Monitor CI rendering-correctness results closely. If tests pass consistently across multiple runs, the texture streaming change is likely effective.

  2. Consider scope: If successful, apply the texture streaming change to other headless kit variants for consistency.

  3. Update comment if keeping: If this change becomes permanent, update the comment above _FLAKY_MARK to explain why the flaky mark is no longer needed (rather than just commenting out the old value).

  4. PR description: Consider filling in the PR template with details about the experiment and expected outcomes.

Verdict

This is a well-structured experiment. The CI results will determine if the approach is successful. No blocking issues found.

Reviewed with focus on architecture, error handling, and test coverage.


Update (95b0b2d): New commits add updated golden images for the texture streaming experiment. This is the expected follow-up — 16 golden baseline images (dexsuite_kuka, shadow_hand, and registered_tasks) have been regenerated with the new rendering settings. ✅ No code changes, just binary asset updates. Previous review comments on the code still stand.


Update (441245d): New commits add several unrelated changes bundled with this PR:

  • h5py lazy import fix (hdf5_dataset_file_handler.py): Moves import h5py from top-level to method-level to fix Windows startup crash (0xc0000139). ✅ Good fix — deferred imports prevent DLL load failures when h5py is not needed.

  • Version bumps: isaaclab 5.5.0→5.5.1, isaaclab_ov 0.2.1→0.3.0, isaaclab_ovphysx 2.0.0→2.1.0, isaaclab_tasks 1.8.0→1.9.0

  • ROS inference env simplification (ros_inference_env_cfg.py): Removed ~50 lines of Hubble Lab-specific robot mount configuration. Added to isaaclab_tasks CHANGELOG as "robot setup and mount configuration" change.

  • Changelog consolidation: Fragment files merged into main CHANGELOGs.

  • Additional golden images: shadow_hand and registered_tasks images updated for texture streaming experiment.

⚠️ Previous inline comments still apply — the core texture streaming change (enableTextureStreaming = false) and flaky mark removal remain unchanged. The P1 concern about removing the flaky retry guard before confirming determinism still stands.


Update (eb89a15): 16 golden images updated (cartpole and shadow_hand for both newton/physx backends). File size variations ±100 bytes, consistent with texture streaming change affecting rendered output. No code changes. ✅


Update (babfbd1): Changed approach on flaky test handling — _FLAKY_MARK is now set back to pytest.mark.flaky(max_runs=3, min_passes=1), but individual marks=_FLAKY_MARK arguments have been removed from all 14 RTX renderer test params. This means the flaky decorator is defined but not applied to any tests.

The net effect is the same as before (no flaky retry for RTX tests), but the code is cleaner — rather than setting _FLAKY_MARK = [] and keeping all the marks=_FLAKY_MARK lines, this removes the marks entirely. ✅ This is a cleaner solution that makes the intent clear: RTX renderer tests are expected to be deterministic with texture streaming disabled.


Update (12fbfb7): New commit adds logging configuration changes to OVRTXRendererCfg:

  • log_level default changed: "verbose" → "info" (reduces log verbosity)
  • log_file_path default changed: "/tmp/ovrtx_renderer.log" → "/dev/stdout" (logs to stdout instead of file)
  • Changelog fragment added documenting these changes

✅ These are sensible defaults for production use — stdout logging is more container/CI friendly, and "info" level is appropriate for normal operation. Users can still override these if they need verbose debugging. No concerns.


Update (e137c79): Log level default reverted from "info" back to "verbose" in OVRTXRendererCfg. Changelog fragment updated to only mention the log_file_path change (no longer documents log_level).

This reversal makes sense — "verbose" logging is likely needed during this experimental phase to capture full diagnostic output when investigating texture streaming determinism. The /dev/stdout change for log_file_path remains in place. ✅

@huidongc huidongc force-pushed the deterministic-rtx-rendering branch from fc9d8b4 to 441245d Compare May 19, 2026 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

isaac-lab Related to Isaac Lab team isaac-sim Related to Isaac Sim team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant