Feature: async node #1620
Conversation
This looks quite interesting! I haven't acquainted myself with async patterns too much yet, so I won't be of much value as a reviewer of the whole PR. One thing has, however, caught my eye: always subscribing to clock. This is something I wouldn't recommend. I sometimes deliberately don't set sim time on a node if I know it doesn't need it (e.g. a plain relay that just gets a message, computes something, and outputs another message with the input timestamp). The reason I do it is performance: if you have a 1000 Hz sim clock and 50 nodes... You should use ROS 1 :D So I definitely don't want to lose the ability to not subscribe to clock. Another thing: is it possible to specify service call timeouts with the async client?
The existing SingleThreadedExecutor would probably fry your CPU if you published at 1000 Hz haha.
Yeah, you could:

```python
import asyncio

from rclpy.duration import Duration
from rclpy.qos import QoSProfile
from std_srvs.srv import SetBool

# AsyncNode comes from this PR's experimental namespace.
async with AsyncNode("node") as node:
    qos_profile = QoSProfile(lifespan=Duration(seconds=2))
    client = node.create_client(SetBool, "/set_bool", qos_profile=qos_profile)
    async with asyncio.timeout(2):  # asyncio.timeout requires `async with`
        response = await client.call(SetBool.Request(data=True))
```

That's a limitation of the current client rcl/rmw API.
As we have discussed before, this is a very high-quality contribution. It brings our interactions in the Python client library in line with what many Python developers would expect. I agree with the refactoring into Base classes; we wouldn't want to inherit from what is already there because it could cause confusion. This is a good move. Otherwise the API ergonomics look great, and putting it in the experimental namespace should give us a little bit of latitude on getting improvements in over time. With that, I would suggest we do this in ~6 PRs. You mentioned 12 on Zulip, but I feel like that would be a lot to manage. I think (just guessing here):
I think those are logical chunks that should each be pretty easy to review.
nadavelkabets left a comment:
Reminder - test AsyncNode with python 3.14
Signed-off-by: Nadav Elkabets <elnadav12@gmail.com>
Pulls: #1620
CI for #1620 with: (build-status badges)
I retriggered the RHEL job with RHEL 10 instead of RHEL 9.
@nadavelkabets once you're done, let me know and I can trigger another full CI run. I'll keep an eye on it and hopefully merge it once it's green (or green enough).
CI for #1620 for: (build-status badges)
Looks like some issues:
Yes, I forgot to add the ros2/ci branch.
Windows passed with a single test failure before.
And ci_linux passed before the last minor changes.
And it now failed with the same single test failure. Nadav says this might be a Windows clock-accuracy thing. I think this should be fixable in a follow-up PR.
CI: just Linux and RHEL, and just testing rclpy, since those tests are the only ones that failed, and they appear to have failed for a CI issue that should be fixed by ros2/ci#871. I assume the above Windows results will still be valid and don't need a re-run.
@sloretz are you able to keep an eye on CI and merge if it looks good (enough)? It's 1:30 AM here 😴 😅
Do we need to restart Linux and RHEL?
I updated the comment.
aarch64 looks good, and now RHEL also looks good (no complaints from pytest). ci_linux (amd64) is stuck in the queue, but I think aarch64 should be fairly representative of it, so I'm calling it good and merging.
Summary
AsyncNode brings native asyncio support for rclpy, enabling async/await throughout subscription callbacks, service handlers, client calls, timers, and clock sleeps.
Usage examples
- run() — simple reactive node
- async with — composable, user-controlled lifetime
- client.call() — async service call, no futures or spinning
- clock.sleep() — sim-time-aware, cancels on node shutdown
- aiohttp — compose with any asyncio library in a callback
- serial — bridge ROS topics to a serial port
Design
Core mechanism
DDS pushes work onto the asyncio event loop instead of an executor polling a wait set:
One reader task per entity waits on an asyncio Event, takes data, and runs the callback. The DDS callback only sets the event. This gives natural backpressure — the entity won't take another message until the current callback yields.
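The reader-task mechanism can be sketched in plain asyncio (no rclpy; every name here is illustrative, not the PR's actual API): the middleware-side callback only appends data and sets an event, while a reader task drains the queue and awaits the user callback.

```python
import asyncio

class EntityReader:
    """Illustrative stand-in for one async entity's DDS bridge."""

    def __init__(self, callback):
        self._event = asyncio.Event()
        self._queue = []            # stands in for the rmw take() buffer
        self._callback = callback

    def on_data(self, msg):
        # The "DDS callback": cheap and non-blocking, it only signals.
        self._queue.append(msg)
        self._event.set()

    async def run(self):
        # One reader task per entity: wait, take, dispatch.
        while True:
            await self._event.wait()
            self._event.clear()
            while self._queue:
                msg = self._queue.pop(0)
                # Backpressure: no further take until this callback yields.
                await self._callback(msg)

async def main():
    received = []

    async def user_callback(msg):
        received.append(msg)

    reader = EntityReader(user_callback)
    task = asyncio.create_task(reader.run())
    reader.on_data("hello")         # simulate the middleware pushing data
    await asyncio.sleep(0)          # give the reader task a turn
    task.cancel()
    return received

print(asyncio.run(main()))  # → ['hello']
```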
Structured concurrency
The design follows structured concurrency (Trio, asyncio TaskGroup). Every task has a clear owner, lifetimes are bounded by async with scopes, and no task outlives its parent.
The node's outer TaskGroup owns entity reader tasks. Subscriptions and services can run callbacks concurrently via a nested inner TaskGroup. When an entity is destroyed, the inner group cancels all in-flight callbacks before the entity handle is cleaned up — no orphaned callbacks. This is required for service correctness (a callback needs the live service handle to send its response), and subscriptions use the same pattern for consistency. Clients and timers don't need inner groups — clients only route responses to futures, and timers dispatch sequentially.
Resource cleanup is deterministic: when the async with block exits, all reader tasks have finished, all DDS callbacks have been cleared, and no orphaned coroutines remain.
Two entry points
Both entry points share the same lifecycle:
async with is for composable use cases where the user controls the lifetime (bridges, tests, multi-protocol applications). run() is for simple reactive nodes.
Class hierarchy
Entity-owned architecture
Each async entity class owns its full DDS bridge: event creation, DDS callback registration, the take loop, callback dispatch, and cleanup. The node is a thin coordinator — it creates C handles via the base class, wraps them in async entity classes, and hands them a reference to its TaskGroup so they can spawn their own reader tasks.
BaseNode holds shared logic (parameters, clock, logger, name resolution, graph discovery) and calls factory methods polymorphically during init, so AsyncNode and Node each produce their own entity types without the base class knowing which subclass it's in.
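A hypothetical sketch of that factory-method polymorphism; the class and method names are illustrative, not the PR's actual API.

```python
class BaseNode:
    def create_subscription(self, topic):
        handle = ("rcl_handle", topic)       # stands in for the C handle
        # Polymorphic: the subclass decides the concrete entity type.
        return self._make_subscription(handle)

    def _make_subscription(self, handle):
        raise NotImplementedError

class Node(BaseNode):
    def _make_subscription(self, handle):
        return ("SyncSubscription", handle)

class AsyncNode(BaseNode):
    def _make_subscription(self, handle):
        return ("AsyncSubscription", handle)

print(Node().create_subscription("/chatter")[0])       # → SyncSubscription
print(AsyncNode().create_subscription("/chatter")[0])  # → AsyncSubscription
```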
Entities self-remove from the node's tracking set when destroyed, via a callback passed at construction. This keeps the node's bookkeeping consistent regardless of whether destruction is triggered by the node, by user code, or by the entity's own cleanup.
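A sketch of self-removing entities (illustrative names, not the PR's API): the node hands each entity a removal callback at construction, and destroy() is idempotent, so any destruction path leaves the tracking set consistent.

```python
class Entity:
    def __init__(self, on_destroy):
        self._on_destroy = on_destroy
        self._destroyed = False

    def destroy(self):
        if self._destroyed:          # idempotent: safe to call twice
            return
        self._destroyed = True
        self._on_destroy(self)       # self-remove from the node

class TrackingNode:
    def __init__(self):
        self._entities = set()

    def create_entity(self):
        entity = Entity(self._entities.discard)
        self._entities.add(entity)
        return entity

node = TrackingNode()
entity = node.create_entity()
entity.destroy()                     # user-triggered destruction
entity.destroy()                     # no-op the second time
print(len(node._entities))           # → 0
```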
Shutdown
Shutdown is synchronous and uses task cancellation as the only mechanism. Destroying the node destroys all entities (cancelling their tasks and DDS handles), cancels pending clock sleeps, then marks the node handle for deferred destruction. Each entity's destroy is idempotent.
Clock sleep and timers
Clock sleep is the async replacement for blocking sleep. For wall time it schedules a delayed callback on the event loop; for sim time it registers a jump callback that resolves when simulated time advances past the target. If the time source changes during a sleep (ROS time activated or deactivated), the sleep raises an error, since the target is no longer meaningful. All pending sleeps are cancelled on node shutdown.
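A simplified, self-contained sketch of the dual-mode sleep (all names illustrative): wall time arms a delayed callback via loop.call_later; sim time parks a future that a clock-jump callback resolves once simulated time passes the target.

```python
import asyncio

class SimClock:
    """Toy clock: advance() plays the role of a /clock update."""

    def __init__(self):
        self.now = 0.0
        self._jump_callbacks = []

    def advance(self, dt):
        self.now += dt
        for cb in list(self._jump_callbacks):
            cb(self.now)

    async def sleep(self, duration, use_sim_time):
        loop = asyncio.get_running_loop()
        done = loop.create_future()
        if not use_sim_time:
            # Wall time: a delayed callback on the event loop.
            loop.call_later(duration, done.set_result, None)
        else:
            # Sim time: resolve once simulated time passes the target.
            target = self.now + duration

            def on_jump(now):
                if now >= target and not done.done():
                    self._jump_callbacks.remove(on_jump)
                    done.set_result(None)

            self._jump_callbacks.append(on_jump)
        await done

async def main():
    clock = SimClock()
    sleeper = asyncio.create_task(clock.sleep(5.0, use_sim_time=True))
    await asyncio.sleep(0)        # let the sleep register its callback
    clock.advance(2.0)            # before the target: still sleeping
    assert not sleeper.done()
    clock.advance(3.5)            # past the target: future resolves
    await sleeper
    return clock.now

print(asyncio.run(main()))  # → 5.5
```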
Timers use the same dual-mode wait pattern. They support cancel (parks the loop until reset) and reset (wakes the parked loop). Timer callbacks are always dispatched sequentially — if a callback runs longer than the period, the next tick is delayed.
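The park/wake behaviour for timers can be sketched like this (illustrative, wall-time only): cancel() clears an event the timer loop waits on, reset() sets it, and dispatch is sequential, so a slow callback simply delays the next tick.

```python
import asyncio

class Timer:
    def __init__(self, period, callback):
        self._period = period
        self._callback = callback
        self._running = asyncio.Event()
        self._running.set()

    def cancel(self):
        self._running.clear()     # parks the loop until reset()

    def reset(self):
        self._running.set()       # wakes the parked loop

    async def run(self, ticks):
        for _ in range(ticks):
            await self._running.wait()   # parked here while cancelled
            await asyncio.sleep(self._period)
            await self._callback()       # sequential dispatch

async def main():
    fired = []

    async def on_tick():
        fired.append("tick")

    timer = Timer(0.001, on_tick)
    await timer.run(3)
    return fired

print(asyncio.run(main()))  # → ['tick', 'tick', 'tick']
```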
Known limitations
Actions and waitables are not yet supported. Waitable support requires a set_on_ready_callback API on the waitable interface, matching the approach used by rclcpp's EventsExecutor, which is not yet available in rclpy.
Performance
CPU usage
I ran the test_rclpy_performance.py benchmark from the EventsExecutor PR, adapted for the AsyncNode.
(CPU usage benchmark chart)
AsyncNode nearly matches EventsExecutor performance while significantly outperforming SingleThreadedExecutor.
Running the test with uvloop.run instead of asyncio.run achieved an even lower CPU usage of 10%.
Timer latency and jitter
At 50 Hz (20 ms period), AsyncNode's mean jitter is slightly higher than the existing executors, at ~0.16 ms above SingleThreadedExecutor and ~0.38 ms above EventsExecutor.
(timer latency and jitter chart)
I didn't invest much in trying to optimize this; honestly, the difference is really small.
Related work