refactor: fuse set_world_poses/set_scales into single _compose_fabric_transform#5674
refactor: fuse set_world_poses/set_scales into single _compose_fabric_transform#5674pv-nvidia wants to merge 3 commits into
Conversation
8b5f661 to
2ca6310
Compare
There was a problem hiding this comment.
🔍 Code Review Summary
This PR successfully refactors FabricFrameView to consolidate duplicate kernel-launch logic into a single _compose_fabric_transform method and addresses a critical assertion bug. The changes are well-structured and improve both correctness and maintainability.
✅ Correctness
No blocking issues found.
The refactoring maintains behavioral equivalence:
_compose_fabric_transformcorrectly handlesNoneparameters via empty arrays, preserving the kernel's ability to skip unspecified components- The
_fabric_usd_sync_doneflag is set correctly after writes - The
assert→RuntimeErrorchange in_rebuild_fabric_arraysis a valuable fix — assertions are stripped underpython -O, which could silently produce wrong poses or out-of-bounds kernel indices in production
Minor observation:
- The
_compose_fabric_transformmethod always calls_fabric_hierarchy.update_world_xforms()even when only scales are updated. This matches the originalset_scalesbehavior, so no regression, but worth noting for future optimization.
⚡ Performance
Net positive impact:
-
Reduced
PrepareForReusecalls — The critical fix:_sync_fabric_from_usd_oncenow invokesPrepareForReuseexactly once (via the fused compose) instead of twice. This prevents potential topology-change signal masking. -
Single kernel launch for initial sync — USD→Fabric sync now uses one
wp.launchinstead of two, reducing kernel launch overhead.
Micro-optimization opportunity (non-blocking):
# Current: creates empty arrays every call
empty3 = wp.zeros((0, 3), dtype=wp.float32, device=self._device)
empty4 = wp.zeros((0, 4), dtype=wp.float32, device=self._device)These could be cached as class-level constants, but the overhead is negligible for zero-sized arrays.
📝 Maintainability
Excellent improvements:
- Code deduplication — ~40 lines of duplicate kernel-launch boilerplate consolidated into one reusable method
- Clearer naming —
empty3/empty4communicates intent better thandummy3/dummy4 - Good documentation — The
_compose_fabric_transformdocstring clearly explains the single-PrepareForReuseguarantee and the_fabric_usd_sync_doneside effect - Test modernization —
wp.to_torch(fab_pos)→fab_pos.torchuses the current ProxyArray API
Changelogs are well-formatted and accurately describe both the behavioral changes and the assert fix.
🔗 Service Locator (Bundled Feature)
The typed service locator in SimulationContext is a clean pattern:
- Type-safe via
TypeVar - Automatic
close()onclear_instance() - Good test coverage for edge cases (re-registration, missing close method, etc.)
One consideration: set_service does not auto-close the replaced service (caller responsibility per docstring). This is documented but could be a subtle footgun. Consider adding an optional auto_close=True parameter in the future if replacement becomes common.
Verdict: Approve ✅
The refactoring achieves its goals cleanly:
- Eliminates duplicate code
- Fixes a real bug (assert under
-O) - Reduces PrepareForReuse call count
- Adds useful infrastructure (service locator)
All 36 existing Fabric tests pass per PR description. Ship it! 🚀
2ca6310 to
1b12c6c
Compare
Add get_service(cls) / set_service(cls, instance) — a lightweight typed singleton registry on SimulationContext, keyed by service class. This lets backend-specific caches (e.g. Fabric hierarchy handles) register themselves without polluting SimulationContext with backend-specific fields or imports. Services with a close() method are automatically closed: - On replacement via set_service() - On teardown via clear_instance() No existing behavior changes — this is purely additive.
- 7 unit tests covering get_service, set_service, replacement lifecycle, close-on-clear_instance, multiple service types, and idempotent re-registration - Changelog entry for the new service locator API
Extract _compose_fabric_transform() to deduplicate the kernel-launch logic shared by set_world_poses and set_scales. The initial USD->Fabric sync now composes position, orientation, and scale in one call, so PrepareForReuse is invoked exactly once per logical update. Also replace assert with RuntimeError in _rebuild_fabric_arrays so the topology-change guard survives python -O.
1b12c6c to
aad8a83
Compare
Problem
FabricFrameViewhad duplicated kernel-launch logic inset_world_posesandset_scales, and the initial USD→Fabric sync called both methods sequentially. This meant:PrepareForReuse— the initial USD→Fabric sync in_sync_fabric_from_usd_oncecalledset_world_posesthenset_scales, each invokingPrepareForReuse. A second non-idempotentPrepareForReusecall could mask a topology-change signal that should have triggered a fabricarray rebuild.Solution
Extract
_compose_fabric_transform(positions=None, orientations=None, scales=None, indices=None)— a single method that composes any subset of transform components into one kernel launch. Components left asNoneare skipped via empty arrays.set_world_poses→ delegates to_compose_fabric_transform(positions=..., orientations=...)set_scales→ delegates to_compose_fabric_transform(scales=...)_sync_fabric_from_usd_once→ single fused call with all three componentsAdditional fix
The topology-change invariant guard in
_rebuild_fabric_arraysusedassert, which is stripped underpython -O. Replaced withraise RuntimeErrorso it's always active.Tests
All 36 existing Fabric tests pass (+ 2 xfail).
Merge Order
This PR is part of a stacked series for Fabric-accelerated local poses:
this depends on