Change clocks to use sim-time aware ROS Time #6063
|
To be tested! I opened the PR to see if I find obvious issues in CI first |
|
One risk I see with this PR: The default clock type of By replacing However for I do think that e.g. controller loop should be using RCL_STEADY_TIME i.e. it shouldn't be affected by e.g. NTP changes. What do you think? |
Especially for the controller server, costmap rates, can you comment on the previous discussions on this subject why we left them the way that they are?
I do think that e.g. controller loop should be using RCL_STEADY_TIME i.e. it shouldn't be affected by e.g. NTP changes. What do you think?
This is a key part of that, I think. As well as for many algorithms that cannot/should not be run at multiples of rate in simulation due to inability to realistically ever keep up - causing incorrect behaviors when running at > 1x speed which are not deterministic at 1x speed.
I mentioned in the ticket that #3325 (comment) is a good summary of the current state
Regarding your previous conclusion here:
I mean it highly depends on the computational complexity of the plugins and the machine's specs no? Without hard numbers yet, but it seems like I can run MPPI on 10 Hz at 5x speed on my Intel(R) Core(TM) Ultra 9 285H.
In case you remember, would be helpful to share more details so I can try to reproduce. The setup I tested so far is quite a simple one with loopback simulation
For this, it's not a blocker, we should just find out a way to switch between "/clock time" and RCL_STEADY_TIME instead of RCL_SYSTEM_TIME (As mentioned, a RCL_ROS_TIME clock will currently choose between "/clock time" and RCL_SYSTEM_TIME according to |
|
I don't recall the specific issues I ran into from that long ago unfortunately. I do remember frustrating number of system tests became flaky though. I'll rerun CI a few times and see if that's still the case here.
On this point for TimedBehavior, ControllerServer, Costmap2DROS, VelocitySmoother, and WaypointFollower: I generally agree. However, I think that the following is too brute force in the application code I don't think we can set the clock type when not simulation time in the NodeOptions unfortunately. Maybe though something to consider adding? The next best thing would be to create a Are there any cases for Rate we want to use NTP corrections? I kind of doubt it. Maybe missed a few?
Given the BT.CPP changes required, can we change this to a draft until that is merged by Davide? |
it's merged already :D We still need to keep BT.CPP in the vendor.repos until next bloom release I presume |
|
btw looks like the Jazzy and Kilted workflows don't build the underlay from vendor.repos? Does that mean that if this gets merged the workflows will remain broken? Is that new to you? |
|
Yes, and that is technically expected. I'm on the fence about it, but tentatively going to update the job to use the repos file for now. The main issue is what happens if changes are made to a key dependency like BT.CPP which are not backward compatible and cannot/will not be updated in Humble or Jazzy. Our CI works for those distributions, but users themselves cannot do it without also updating something like BT.CPP, which from v3 to v4 may fundamentally not work with their existing code. That's not an issue today since Davide appears to release v4 up to date in every ROS distribution, so I'll kick this can down the road for another time if this becomes an issue. Here's a PR to resolve: #6069 |
|
ros2/rclcpp#3122 made an issue for rclcpp but we will need to find a local solution in the meantime - I'm working on that |
|
Don't review yet |
|
Actually in terms of changes that's pretty much it so you can take a look already. Just didn't have the time to make a nice description but basically, with the help of some custom Nav2 wrappers of |
Otherwise the code that's changed LGTM.
I will say though that I need to look at this again with fresh eyes. Does this imply that we should actually update all Rates or Timers to use this as well? Is there a downside to that in any case? I do like the idea of abstracting even more rclcpp::'s into nav2::'s but only if that's really the right answer.
Pull request overview
Updates Nav2 timing primitives to be robust to system clock jumps and simulation-time execution by selecting an appropriate clock (ROS/sim clock when use_sim_time=true, otherwise steady/monotonic time) for rates and timers.
Changes:
- Added `nav2::selectClock()` and `nav2::WallRate` and migrated several rate-limited loops to use it.
- Added sim-time-aware `create_wall_timer` helpers (free function + `nav2::LifecycleNode` shadow) and migrated `LifecycleManager` timers.
- Updated BT execution loop (`LoopRate` + `BehaviorTreeEngine`) to use an injected clock and a wakeup-driven wait loop.
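To make the clock-selection rule in the overview concrete, here is a minimal standard-library model of it. This is only a sketch: the `Clock`, `SteadyClock` and `SimClock` types and the `selectClock` signature below are hypothetical stand-ins written for illustration, not the rclcpp or Nav2 API, and the actual helper in this PR may look different.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <memory>

// Hypothetical minimal clock interface (NOT rclcpp::Clock).
struct Clock {
  virtual ~Clock() = default;
  virtual std::chrono::nanoseconds now() const = 0;
};

// Monotonic time source: unaffected by NTP or manual system clock changes.
struct SteadyClock : Clock {
  std::chrono::nanoseconds now() const override {
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
      std::chrono::steady_clock::now().time_since_epoch());
  }
};

// Simulated time source: only advances when something sets it,
// in the way a /clock subscriber would.
class SimClock : public Clock {
public:
  void set(std::chrono::nanoseconds t) { now_ns_.store(t.count()); }
  std::chrono::nanoseconds now() const override {
    return std::chrono::nanoseconds(now_ns_.load());
  }

private:
  std::atomic<std::int64_t> now_ns_{0};
};

// Assumed selection rule: follow the node's (simulated) clock when
// use_sim_time is true, otherwise use a fresh monotonic clock.
std::shared_ptr<Clock> selectClock(bool use_sim_time, std::shared_ptr<Clock> node_clock) {
  assert(node_clock != nullptr);
  if (use_sim_time) {
    return node_clock;
  }
  return std::make_shared<SteadyClock>();
}
```

A rate or timer built on the returned clock then follows simulation time in simulation and stays monotonic on real hardware.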
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tools/underlay.repos | Pins BehaviorTree.CPP to a specific commit to pick up Tree::wakeUpSignal() support. |
| nav2_waypoint_follower/src/waypoint_follower.cpp | Migrates waypoint follower loop rate to nav2::WallRate. |
| nav2_ros_common/include/nav2_ros_common/wall_rate.hpp | Introduces nav2::selectClock() and nav2::WallRate for sim-time-aware rate limiting. |
| nav2_ros_common/include/nav2_ros_common/lifecycle_node.hpp | Shadows create_wall_timer() to use the selected clock for lifecycle nodes. |
| nav2_ros_common/include/nav2_ros_common/interface_factories.hpp | Adds nav2::create_wall_timer() helper using clock selection. |
| nav2_lifecycle_manager/src/lifecycle_manager.cpp | Migrates timers (init_timer_, bond timers) to nav2::create_wall_timer(). |
| nav2_costmap_2d/src/costmap_2d_ros.cpp | Migrates costmap update loop to nav2::WallRate. |
| nav2_controller/src/controller_server.cpp | Migrates controller compute loop to nav2::WallRate. |
| nav2_behaviors/include/nav2_behaviors/timed_behavior.hpp | Migrates behavior cycle loop to nav2::WallRate. |
| nav2_behavior_tree/test/plugins/action/test_bt_action_node.cpp | Updates test to new LoopRate constructor signature (explicit clock). |
| nav2_behavior_tree/src/behavior_tree_engine.cpp | Injects selected clock into BT loop rate handling. |
| nav2_behavior_tree/include/nav2_behavior_tree/utils/loop_rate.hpp | Reworks sleeping to poll an injected clock using BT wakeup signal. |
| nav2_behavior_tree/include/nav2_behavior_tree/behavior_tree_engine.hpp | Adds rate_clock_ member to support sim/steady timing for BT loop. |
| nav2_amcl/src/amcl_node.cpp | Minor include cleanup and explicitly calls this->create_wall_timer(). |
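The `loop_rate.hpp` row describes sleeping by polling an injected clock. The following is a rough, standard-library-only sketch of that idea, not the code from this PR: `NowFn`, `WaitFn`, `waitUntil` and the 1 ms default poll period are all assumptions made for illustration. `WaitFn` stands in for a wait that blocks for at most the given wall-clock duration (as a wake-up signal with a timeout would).

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>

using Nanos = std::chrono::nanoseconds;

// Hypothetical stand-ins (not BT.CPP or rclcpp types):
// NowFn reads the injected clock, WaitFn blocks for at most the
// given wall-clock duration.
using NowFn = std::function<Nanos()>;
using WaitFn = std::function<void(Nanos)>;

// Blocks until the injected clock reaches `deadline`.
// Returns how many waits were needed.
int waitUntil(
  const NowFn & now, const WaitFn & wait_for, Nanos deadline,
  bool clock_is_sim_time, Nanos poll_period = std::chrono::milliseconds(1))
{
  assert(poll_period > Nanos::zero());
  int waits = 0;
  for (;;) {
    const Nanos remaining = deadline - now();
    if (remaining <= Nanos::zero()) {
      return waits;
    }
    // Sim time may run faster or slower than wall time, so wait in short
    // wall-clock slices and re-read the clock. A steady or system clock
    // advances with wall time, so a single wait of `remaining` suffices.
    wait_for(clock_is_sim_time ? std::min(remaining, poll_period) : remaining);
    ++waits;
  }
}
```

With a simulated clock running at 2x, a 10 ms deadline is reached after five 1 ms wall-clock waits, whereas a steady clock needs a single 10 ms wait.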
|
Otherwise just this topic:
|
I'm going to use the terminology in ros2/rclcpp#3122 (comment). For rates and timers, I think RCL_ROS_STEADY_TIME is always wanted over RCL_ROS_TIME or RCL_SYSTEM_TIME, but there are probably a few places, e.g. in tests or maybe rviz panels, where we need RCL_STEADY_TIME (not sim-time aware). I would need to do a deep dive |
|
I think as part of this, we should, since it's an easy grep + replace if it turns out there's no real reason to use the other now. |
|
Ok I did another pass and found more
I'm torn because they all have some drawbacks but on the other hand whatever we choose is a (hopefully) temporary option until ros2/rclcpp#3122 is implemented. Because of that, I tend towards option 1 |
Signed-off-by: Tony Najjar <[email protected]>
… tree and lifecycle node Signed-off-by: Tony Najjar <[email protected]>
…p rate Signed-off-by: Tony Najjar <[email protected]>
…aviorTester Signed-off-by: Tony Najjar <[email protected]>
…ocitySmoother timer Signed-off-by: Tony Najjar <[email protected]>
…n in VelocitySmoother timer" This reverts commit 891b725. Signed-off-by: Tony Najjar <[email protected]>
…eleopBehaviorTester" This reverts commit 77501d1. Signed-off-by: Tony Najjar <[email protected]>
This reverts commit 7f9bad4. Signed-off-by: Tony Najjar <[email protected]>
…n in loop rate" This reverts commit f56ce62. Signed-off-by: Tony Najjar <[email protected]>
This reverts commit f6de7e5. Signed-off-by: Tony Najjar <[email protected]>
This reverts commit a0c3036. Signed-off-by: Tony Najjar <[email protected]>
This reverts commit d50c80e. Signed-off-by: Tony Najjar <[email protected]>
…anager Co-authored-by: Copilot <[email protected]> Signed-off-by: Tony Najjar <[email protected]>
Co-authored-by: Copilot <[email protected]> Signed-off-by: Tony Najjar <[email protected]>
Force-pushed from 1d172e5 to e97d675.
Basic Info
Description of contribution in a few bullet points
- Rate-limited control loops now use ROS time in simulation, steady time on real hardware. The `ControllerServer`, `VelocitySmoother` and `Costmap2DROS` update loops select their clock based on `use_sim_time`: the node's ROS clock (follows `/clock`) when `true`, or a monotonic `RCL_STEADY_TIME` clock when `false`. This makes the loops sim-time-aware while remaining immune to NTP clock jumps on real hardware.
- All other production `rclcpp::Rate` and `rclcpp::WallRate` usages in continuous control/update loops migrated to use the node clock (`this->get_clock()`), making them sim-time-aware:
  - `TimedBehavior` (behaviors)
  - `DockingServer` (7 loops)
  - `FollowingServer` (4 loops)
  - `RouteTracker`
  - `WaypointFollower`
  - `InputAtWaypoint`

  When not using sim time, these fall back to system time. We accept this because these components depend less crucially on precise monotonic timing than the components mentioned above.
- `create_wall_timer` → `create_timer` in `AmclNode` (save pose timer) and `VelocitySmoother` (smoother timer), so these timers respect sim time.
- BT `LoopRate` updated to accept an injected clock and use a polling loop via `tree->wakeUpSignal()` instead of `tree->sleep()`, which was hardwired to wall-clock time. When running with sim time, the loop polls in short wall-clock intervals and re-checks the target clock; for steady/system clocks it waits for the exact remaining duration.
- No new abstractions introduced. The clock selection is done inline at the two call sites that need it (controller server and costmap), using a straightforward pattern:
- Polling/waiting loops and wall-time-bound operations left on default clock. `rclcpp::Rate` usages that poll for readiness (TF availability, costmap initialization, DDS discovery, bond heartbeats) or wait for CPU/network-bound operations don't need sim-time awareness and are left unchanged. Only continuous control and update loops are migrated.
- Test files left unchanged except for system tests that run with Gazebo (`assisted_teleop_behavior_tester`, `wait_behavior_tester`) where `rclcpp::Rate` is used in a control-like loop with `use_sim_time=true`.
Dependencies
- Requires BehaviorTree.CPP#1127 (`Tree::wakeUpSignal()` getter), already merged.
Description of documentation updates required from your changes
None. No new parameters, plugins, or public API changes.
Description of how this change was tested
TODO
Future work that may be required in bullet points