feat(process): add async/fiber execution mode for workers #112
Introduce an ExecutionPools abstraction (ThreadPool + AsyncPool) that allows workers to execute jobs as fibers on a single async reactor thread instead of a thread pool. This dramatically reduces PostgreSQL connection usage for I/O-bound workloads (HTTP calls, email, LLM APIs).

Key changes:

- ExecutionPools factory with .build(mode:, capacity:, on_state_change:)
- ThreadPool: wraps Concurrent::FixedThreadPool with a unified interface
- AsyncPool: reactor thread + Async::Semaphore bounded fibers
- Worker/Consumer refactored to use the pool abstraction
- Configuration: execution_mode attr with per-worker override support
- CLI: --execution-mode flag
- Supervisor: passes execution_mode to forked workers

The async pool uses Thread::Queue as a cross-thread inbox between the worker main loop and the reactor, with an on_state_change callback to wake the worker immediately when fibers complete. Boot synchronization via Thread::Queue ensures the reactor is ready before accepting work.

PGMQ visibility timeout provides crash safety regardless of execution mode: messages stay in PostgreSQL until archived.

## Test Coverage

- ThreadPool spec: interface contract, capacity tracking, shutdown
- AsyncPool spec: fiber execution, capacity bounds, shutdown, error handling, on_state_change callback, boot synchronization
- Factory spec: mode routing, :fiber alias, invalid mode
- Worker spec: async mode wiring, execution_mode in stats
- Configuration spec: defaults, validation, per-worker override
- CLI spec: --execution-mode flag

## Verification

- [x] bundle exec rubocop passes (0 offenses)
- [x] bundle exec rspec passes (264 new/modified specs, 0 failures)
- [x] Zero regressions in existing test suite
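The inbox and boot handshake described above can be sketched with stdlib primitives. This is an illustrative sketch only — a plain Thread stands in for the async reactor, and all names here are made up rather than taken from the PR's code; in the real AsyncPool the loop body would spawn semaphore-bounded fibers instead of calling the job directly.

```ruby
# Sketch of the worker <-> reactor handshake. Names are illustrative.
inbox = Thread::Queue.new   # cross-thread inbox: worker -> reactor
boot  = Thread::Queue.new   # boot handshake: reactor -> worker
wake  = Thread::Queue.new   # stand-in for the on_state_change wake-up
log   = []

reactor = Thread.new do
  boot << :ready              # tell the worker the reactor is up
  while (job = inbox.pop)     # pop returns nil once the queue is closed
    job.call
    wake << :state_changed    # wake the worker as soon as the job finishes
  end
end

boot.pop                      # block until the reactor has booted
inbox << -> { log << :job_ran }
wake.pop                      # worker resumes immediately after completion
inbox.close                   # closed queue ends the reactor loop
reactor.join
```

The same pattern gives both guarantees the description mentions: the worker never posts before the reactor exists (`boot.pop`), and it never polls for completion (`wake.pop`).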
📝 Walkthrough

Adds a pluggable execution-pool abstraction with thread and async/fiber pool implementations, and wires global and per-worker execution-mode configuration through the CLI, Supervisor, Worker, and Consumer.
Sequence Diagram(s)

sequenceDiagram
participant CLI as User/CLI
participant Supervisor as Supervisor
participant Config as Configuration
participant ExecPool as ExecutionPools
participant Pool as ThreadPool/<br/>AsyncPool
participant Worker as Worker
CLI->>Supervisor: fork_worker(worker_config)
Supervisor->>Config: execution_mode_for(worker_config)
Config-->>Supervisor: :threads / :async
Supervisor->>Worker: new(execution_mode: mode, ...)
Worker->>ExecPool: build(mode:, capacity:, on_state_change:)
alt mode == :threads
ExecPool->>Pool: ThreadPool.new(capacity:)
else mode == :async
ExecPool->>Pool: AsyncPool.new(capacity:, on_state_change:)
end
Pool-->>Worker: pool instance
Worker->>Worker: start run-loop
loop Process messages
Worker->>Pool: post { execute_task }
Pool->>Pool: reserve capacity -> execute -> restore capacity
Pool-->>Worker: on_state_change callback
end
sequenceDiagram
participant Consumer as Consumer
participant Config as Configuration
participant ExecPool as ExecutionPools
participant Pool as ThreadPool/<br/>AsyncPool
participant MessageBus as Message Bus
Consumer->>Config: determine execution_mode
Consumer->>ExecPool: build(mode:, capacity: threads)
ExecPool-->>Consumer: pool
Consumer->>Consumer: start poll loop
loop Fetch & Process
Consumer->>Pool: available_capacity
Pool-->>Consumer: free_slots
Consumer->>MessageBus: fetch(limit: free_slots)
MessageBus-->>Consumer: messages
Consumer->>Pool: post { process_message }
Pool->>Pool: execute concurrently
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: 2 passed, 1 failed (warning)
Actionable comments posted: 9
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@benchmarks/connection_pool_bench.rb`:
- Around line 37-46: The count_active_connections method currently interpolates
prefix into the SQL and is vulnerable to injection; change the query to use a
parameterized call (e.g., conn.exec_params) passing the prefix value (wrapped
with '%' wildcards) as a bound parameter instead of interpolating, keeping the
same WHERE clause logic and preserving the existing PG::Error rescue behavior.
- Line 12: Remove the unused require by deleting the require "benchmark/ips"
statement so the script only loads what's needed; locate and remove the require
"benchmark/ips" line at the top of the file (there are no references to
Benchmark.ips elsewhere, so no other changes are required).
In `@benchmarks/execution_pool_bench.rb`:
- Around line 54-95: The benchmark's I/O-bound section (the Benchmark.ips block
that iterates CAPACITIES and calls build_thread_pool / build_async_pool) uses
sleep(0.01) which under MRI releases the GVL and may not demonstrate
fiber/non-blocking I/O advantages; update the I/O-bound section by adding a
clear comment above the block stating that sleep(0.01) is an approximation and
can bias results on MRI, and optionally replace the sleep-based workload with a
real non-blocking async I/O simulation (e.g., non-blocking socket or file I/O or
an async library) if you want more realistic fiber performance
comparisons—target the Benchmark.ips reports and the pool.post bodies where
sleep is invoked to make these changes.
In `@lib/pgbus/execution_pools.rb`:
- Around line 6-13: The build method in execution_pools.rb currently passes
on_state_change only to AsyncPool but not to ThreadPool, so update the
ThreadPool instantiation in build (the case for normalize_mode(mode) when
:threads) to forward the on_state_change parameter to ThreadPool.new (matching
AsyncPool.new), ensuring ThreadPool#initialize receives the callback and can
signal workers when capacity frees up.
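The fixed factory shape can be sketched as follows. The Struct-based pool classes here are simplified stand-ins, not the PR's implementations — the point is only the mode routing, the `:fiber` alias, and forwarding `on_state_change` to both pool types:

```ruby
# Sketch of the factory after the review fix: on_state_change is
# forwarded to BOTH pool types, not only AsyncPool.
module ExecutionPools
  # Hypothetical stand-ins for the real pool classes.
  ThreadPool = Struct.new(:capacity, :on_state_change, keyword_init: true)
  AsyncPool  = Struct.new(:capacity, :on_state_change, keyword_init: true)

  def self.build(mode:, capacity:, on_state_change: nil)
    case normalize_mode(mode)
    when :threads
      ThreadPool.new(capacity: capacity, on_state_change: on_state_change)
    when :async
      AsyncPool.new(capacity: capacity, on_state_change: on_state_change)
    else
      raise ArgumentError, "unknown execution mode: #{mode.inspect}"
    end
  end

  def self.normalize_mode(mode)
    mode.to_sym == :fiber ? :async : mode.to_sym  # :fiber aliases :async
  end
end
```

With this shape, `ExecutionPools.build(mode: :threads, capacity: 5, on_state_change: cb)` carries the callback into the thread pool, so thread-mode workers get the same wake signal async-mode workers already had.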
In `@lib/pgbus/execution_pools/thread_pool.rb`:
- Around line 31-33: Document the semantics of idle? to clarify it means "has
some capacity available" (available_capacity.positive?) rather than "completely
idle"; add a brief comment above the idle? method explaining that idle? returns
true when the pool can accept more work (i.e., there are free worker slots), and
note its usage in the drain check (`break if draining? && @pool.idle?`) so future
readers understand the intended behavior.
- Around line 17-25: The post method decrements `@available_capacity` before
calling `@pool.post` but if `@pool.post` raises (e.g., pool shutdown) the decrement
is never restored because the ensure runs only inside the posted block; fix by
rescuing exceptions from `@pool.post`: after `@available_capacity.decrement`, wrap
the `@pool.post` call in a begin/rescue that on any exception increments
`@available_capacity` and calls `@on_state_change&.call`, then re-raises the
exception; keep the existing ensure inside the posted block unchanged so
capacity is also restored after normal execution.
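The capacity-restore pattern the comment asks for can be sketched with a plain integer counter. This is a hypothetical simplification — `InlineExecutor` and `RejectingExecutor` stand in for `Concurrent::FixedThreadPool`, which runs jobs asynchronously so job errors never reach the outer rescue:

```ruby
# Reserve a slot up front; if the executor rejects the job (e.g. it has
# shut down), undo the reservation and fire the callback before re-raising.
class CountingPool
  attr_reader :available_capacity

  def initialize(executor, capacity:, on_state_change: nil)
    @executor = executor
    @available_capacity = capacity
    @on_state_change = on_state_change
  end

  def post(&block)
    @available_capacity -= 1          # reserve a slot
    begin
      @executor.post do
        block.call
      ensure
        @available_capacity += 1      # normal path: restore after the job
        @on_state_change&.call
      end
    rescue StandardError
      @available_capacity += 1        # rejection path: undo the reservation
      @on_state_change&.call
      raise
    end
  end
end

class InlineExecutor                  # stand-in: runs jobs synchronously
  def post
    yield
  end
end

class RejectingExecutor               # stand-in: a shut-down executor
  def post
    raise "pool is shut down"
  end
end
```

Without the rescue branch, every rejected post would leak one slot of `available_capacity` permanently.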
In `@spec/pgbus/execution_pools/async_pool_spec.rb`:
- Around line 20-27: Consolidate the duplicate examples or make the "boots the
reactor thread synchronously" example assert something different: either merge
both expectations into the "starts fully available" example (keep
expect(pool.available_capacity).to eq(3) and also any other quick check) or
change the boot test to assert a boot-specific condition such as the reactor
thread being alive (e.g. reference pool.reactor_thread and assert it is alive)
or that boot completed without blocking using a Timeout wrapper; update the
example titled "boots the reactor thread synchronously" to use
pool.reactor_thread or a non-blocking assertion instead of repeating
expect(pool.available_capacity).to eq(3).
In `@spec/pgbus/execution_pools/thread_pool_spec.rb`:
- Line 95: Rubocop flagged a Style/Semicolon violation on the inline block
passed to pool.post; change the single-line block "pool.post { sleep 0.05;
result.set(:done) }" to a multi-line do...end block so each statement is on its
own line (use pool.post do ... sleep 0.05 then result.set(:done) end),
preserving the same behavior and referencing pool.post and result.set to locate
the code.
- Line 85: The line using pool.post { sleep 0.05; result.set(:done) } violates
RuboCop Style/Semicolon; replace the single-line block with a multi-line block
so the two expressions are on separate lines, e.g. use pool.post do ... end,
call sleep 0.05 on the first line and then call result.set(:done) on the next
line to keep pool.post and result.set unchanged.
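The before/after shape of this RuboCop fix, with `post` and `result` as trivial stand-ins for the spec's pool and IVar:

```ruby
result = []
post = ->(&blk) { blk.call }   # stand-in for pool.post

# Offending single-line form (two statements joined by a semicolon):
#   pool.post { sleep 0.05; result.set(:done) }

# Preferred multi-line do...end block, one statement per line:
post.call do
  sleep 0.01                   # shortened from 0.05 for the example
  result << :done
end
```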
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 06311454-9801-4524-9648-eec3a46e3996
📒 Files selected for processing (19)
- CLAUDE.md
- README.md
- benchmarks/connection_pool_bench.rb
- benchmarks/execution_pool_bench.rb
- lib/pgbus/cli.rb
- lib/pgbus/configuration.rb
- lib/pgbus/execution_pools.rb
- lib/pgbus/execution_pools/async_pool.rb
- lib/pgbus/execution_pools/thread_pool.rb
- lib/pgbus/process/consumer.rb
- lib/pgbus/process/supervisor.rb
- lib/pgbus/process/worker.rb
- spec/pgbus/cli_spec.rb
- spec/pgbus/configuration_spec.rb
- spec/pgbus/execution_pools/async_pool_spec.rb
- spec/pgbus/execution_pools/thread_pool_spec.rb
- spec/pgbus/execution_pools_spec.rb
- spec/pgbus/process/consumer_spec.rb
- spec/pgbus/process/worker_spec.rb
- Fix RSpec/RepeatedExample in async_pool_spec (distinct assertion)
- Fix Style/Semicolon in thread_pool_spec (multi-line blocks)
- Rename github-ci-failures command to github-review-failures for consistency with github-review-comments
Actionable comments posted: 2
Inline comments:
In `@spec/pgbus/execution_pools/async_pool_spec.rb`:
- Line 57: Replace the flaky fixed waits (the occurrences of "sleep 0.05" in the
async_pool_spec) with event-driven synchronization: have the worker thread(s)
signal test progress via a Thread::Queue (or a Monitor + ConditionVariable /
simple CountDownLatch), then in the spec replace each sleep 0.05 with a
queue.pop(timeout) (or condition wait with timeout) that fails the test on
timeout; apply this change to every "sleep 0.05" instance so tests wait for
explicit signals from the code under test rather than fixed sleeps.
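The sleep-to-signal replacement suggested here looks roughly like the sketch below. The names are illustrative, not the spec's actual code — the thread under test pushes an explicit progress marker onto a Thread::Queue, and the test waits on it with a deadline instead of a fixed sleep:

```ruby
require "timeout"

progress = Thread::Queue.new

worker = Thread.new do
  # ... exercise the code under test here ...
  progress << :job_finished   # explicit signal, no timing assumptions
end

# Fail fast with Timeout::Error instead of sleeping a fixed 0.05s
# and hoping the worker is done by then.
signal = Timeout.timeout(2) { progress.pop }
worker.join
```

This makes the test both faster on fast machines (no forced wait) and more reliable on slow CI (no missed deadline short of the 2s cap).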
In `@spec/pgbus/execution_pools/thread_pool_spec.rb`:
- Around line 22-43: Add a spec that verifies ThreadPool#post triggers the
on_state_change callback: arrange a spy or proc and assign it to the pool's
`@on_state_change` (or use `pool.instance_variable_set(:@on_state_change, callback)`), post a job that completes (e.g., sets an IVar or flips a flag), and
assert the callback was invoked (use a Concurrent::IVar or a simple boolean
captured by the proc) after the job runs; target the ThreadPool#post behavior
and the `@on_state_change` invocation to ensure wake-up/re-scheduling contract is
covered.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: b3ce1ffe-6233-43d7-bcb8-6468e21f3298
📒 Files selected for processing (3)
- .claude/commands/github-review-failures.md
- spec/pgbus/execution_pools/async_pool_spec.rb
- spec/pgbus/execution_pools/thread_pool_spec.rb
- Pass on_state_change callback to ThreadPool in factory (was only passed to AsyncPool — workers in thread mode missed wake signals)
- Guard ThreadPool#post against capacity leak if pool.post raises
- Remove unused benchmark/ips require from connection_pool_bench
- Use exec_params in connection_pool_bench to avoid SQL interpolation
- Add on_state_change callback test for ThreadPool
When Listener#stop closes the PG connection to interrupt
wait_for_notify, Ruby raises IOError ("stream closed in another
thread") rather than PG::Error. The rescue clause only caught
PG::Error, leaving IOError unhandled and dumping a stack trace
to stderr on every streamer shutdown.
Catch IOError before PG::Error since it indicates an expected
shutdown — the connection was deliberately closed by #stop.
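The behavior this commit relies on can be demonstrated with stdlib IO alone. Here `IO.pipe` stands in for the PG connection's socket (a hedged sketch, not the listener's code): closing an IO that another thread is blocked reading raises IOError ("stream closed in another thread") in the blocked thread, which is why rescuing only PG::Error missed it.

```ruby
r, w = IO.pipe
caught = nil

reader = Thread.new do
  r.read                  # blocks, like wait_for_notify on the PG socket
rescue IOError => e       # the expected-shutdown path the commit adds
  caught = e
end

Thread.pass until reader.stop?   # wait until the reader is blocked
r.close                          # what Listener#stop effectively does
reader.join
w.close
```

Because IOError is not a PG::Error subclass, the rescue clauses must list IOError explicitly (and first, per the commit) for shutdown to be quiet.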
Actionable comments posted: 1
Inline comments:
In `@lib/pgbus/web/streamer/listener.rb`:
- Around line 116-120: The rescue for IOError currently only breaks when
`@running` is false and otherwise silently continues; change it so that when an
IOError occurs while `@running` is true it performs the same logging and reconnect
flow as the existing PG::Error handler (lines 22–26) instead of silently
continuing — i.e., log the IOError with context (using the same logger used for
PG::Error) and invoke the same reconnect/cleanup routine (the listener's
reconnect/connection-reset logic) so the listener recovers; keep the existing
break behavior when `@running` is false.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: aff46f95-be4c-403a-b31e-72d13f9f0697
📒 Files selected for processing (1)
lib/pgbus/web/streamer/listener.rb
MemoryProfiler on Ruby 3.3 picks up lazily-initialized globals from gems loaded by other specs in the same process (e.g. async's fiber scheduler hooks). Force GC.start before the report so these objects are collected and not attributed to send_message calls. This consistently failed on CI Ruby 3.3 (retained=5) while passing on 3.4 and 4.0 where GC is more aggressive about collecting fiber-related global state.
When IOError fires while the listener is still running (not from #stop shutdown), log the error and reconnect — same as the PG::Error path. Without this, the loop would continue trying wait_for_notify on a dead connection.
Replace GC.start workaround with MemoryProfiler's allow_files filter. This scopes retained object tracking to lib/pgbus/ so external gem globals (async fiber hooks, connection_pool singletons, JSON parser caches) loaded by other specs in the same process are excluded. The test now validates that Pgbus's own send_message path retains zero objects — regardless of what other gems do globally.
Summary
Adds an optional async/fiber execution mode for Pgbus workers, inspired by Solid Queue PR #728. Workers can now execute jobs as fibers on a single async reactor thread instead of a thread pool, dramatically reducing PostgreSQL connection usage for I/O-bound workloads.
- `ThreadPool` and `AsyncPool` backends behind a unified interface
- `Async::Semaphore` bounded fibers, `Thread::Queue` cross-thread inbox, boot synchronization, fatal error propagation
- `execution_mode` setting + per-worker override, `:fiber` alias for `:async`
- `--execution-mode async` flag

Why it's safe
Messages stay in PGMQ with visibility timeout protection regardless of execution mode. If a fiber or worker crashes, VT expires and messages become available for re-read.
Test plan
- `pgbus start --execution-mode async`

Summary by CodeRabbit