docs: update README with recent feature documentation by mhenrixon · Pull Request #122 · mhenrixon/pgbus

mhenrixon · 2026-04-11T12:09:43Z

Summary

Catch-up documentation for features merged in PRs 110-121 that were missing from the README.

New sections

- Observability (new top-level section):
  - Error reporting: config.error_reporters with APM integration examples (Appsignal, Sentry)
  - Structured logging: config.log_format = :json with component extraction and context
  - Queue health monitoring: dashboard panel, autovacuum tuning generator/rake task, 8 Prometheus metrics
- Retry backoff (under Reliability): exponential backoff with jitter, per-job overrides via RetryBackoff::JobMixin
- Stream name helpers (under Real-time broadcasts): Pgbus.stream_key, Streams::Streamable mixin for UUID-heavy apps

Config reference additions

| Config | PR |
| --- | --- |
| `retry_backoff`, `retry_backoff_max`, `retry_backoff_jitter` | #110 |
| `streams_path` | #111 |
| `execution_mode` | #112 |
| `streams_stats_enabled` | #117 |
| `error_reporters`, `log_format` | #121 |
| `metrics_enabled` | #117 |

Other updates

- Features list: retry backoff, error reporting, structured logging, queue health
- TOC: new Observability section, Retry backoff under Reliability
- Database tables: pgbus_presence_members, pgbus_stream_stats

Test plan

- Verify all TOC links resolve to correct anchors
- Verify config reference entries match actual defaults in configuration.rb
- Verify code examples are syntactically correct

Summary by CodeRabbit

Documentation
- Added retry backoff guidance describing configurable exponential backoff with jitter and per-job overrides
- Expanded observability docs: error reporting hooks, structured (JSON/text) logging with context, and optional metrics
- Documented queue health monitoring and real-time stream stats/monitoring
- Described stream naming helpers and handling when names exceed queue-length limits
- Added entries for presence and stream-stats tables and updated configuration reference

Add documentation for features merged in PRs 110-121 that were missing from the README:
- Error reporting (config.error_reporters) with APM integration examples
- Structured JSON logging (config.log_format) with component extraction
- Queue health monitoring (dashboard panel, autovacuum tuning, Prometheus metrics)
- Retry backoff (exponential with jitter, per-job overrides via JobMixin)
- Stream name helpers (Pgbus.stream_key, Streams::Streamable mixin)
- Missing config reference entries (retry_backoff/max/jitter, streams_path, execution_mode, error_reporters, log_format, metrics_enabled, streams_stats_enabled)
- Database tables (pgbus_presence_members, pgbus_stream_stats)
- Updated features list and table of contents

coderabbitai · 2026-04-11T12:09:54Z

Walkthrough

README updated to document new reliability and observability features: configurable exponential retry backoff with jitter, pluggable error reporting, structured logging formats, queue health monitoring/metrics, stream name helpers, and additions to features, database tables, and configuration references.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation `README.md` | Added "Retry backoff" (global and per-job backoff, jitter, `Pgbus::RetryBackoff::JobMixin`/`pgbus_retry_backoff`), "Observability" (error reporters callable receiving `(exception, context_hash)`, structured JSON/text log formats via `config.log_format`, optional Prometheus metrics `config.metrics_enabled`), "Stream name helpers" (`Pgbus::Streams::Streamable`, `short_id`/`to_stream_key`, `Pgbus.stream_key`, `Pgbus::Streams::StreamNameTooLong`), expanded Features/Database tables/Configuration reference and added `pgbus_presence_members` and `pgbus_stream_stats` tables. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

feat(retry): exponential backoff for VT-based retries #110: Introduces the retry-backoff implementation and JobMixin/pgbus_retry_backoff DSL documented in this README update.
#86: README restructuring into themed sections (Reliability/Observability) that these additions follow or complement.

Poem

🐇 I nibble docs with eager paws,

Backoffs, logs, and metrics—no lost cause.
Streams named tidy, errors shown bright,
Queues hum steady through the night.
Hooray — observability takes flight!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title 'docs: update README with recent feature documentation' is generic and uses non-specific language. While it mentions README updates, it lacks detail about what specific features or improvements are documented, making it difficult to understand the scope of changes from the title alone. | Consider a more specific title that highlights the primary features documented, such as 'docs: add observability, retry backoff, and stream helpers to README' to better convey the significance and scope of the documentation updates. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Around line 736-738: The fenced code block showing the log example uses plain
triple backticks without a language; update the opening fence from ``` to
```text so the block containing "INFO 2025-01-15T10:30:00.000Z pid=1234 tid=abc
queue=default: Starting job" has a language identifier for proper highlighting
and to satisfy markdownlint.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 99c48d2a-a58c-4b1e-b21a-50a8089520a1

📥 Commits

Reviewing files that changed from the base of the PR and between 6c8f847 and 68fc79f.

📒 Files selected for processing (1)

README.md

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Line 487: The README's backoff formula currently shows "base * 2^(attempt-1) *
(1 + random_jitter)" which yields only positive jitter; update the formula to
reflect true ±jitter by using a uniform multiplier range, e.g., "base *
2^(attempt-1) * uniform(1 - jitter, 1 + jitter)" (or equivalently "base *
2^(attempt-1) * (1 + random_in_range(-jitter, jitter))") so the described "±15%
randomization" matches the documented formula and clarify the variable name
"jitter" used in the description.
- Around line 487-488: Update the README retry example to reflect the actual
default max_retries logic: the code checks read_ct > max_retries with
max_retries = 5, so a message must fail 5 times and will hit the DLQ on the 6th
read; change the sentence to: "For a job that fails 5 times with defaults: ~5s,
~10s, ~20s, ~40s, ~80s before hitting DLQ on the 6th read." Reference the README
line describing the delay formula and the variables max_retries and read_ct when
making this edit.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2f7d7683-2d98-4800-a7bd-5fdf0966e4cb

📥 Commits

Reviewing files that changed from the base of the PR and between 68fc79f and e977c2b.

📒 Files selected for processing (1)

README.md

coderabbitai · 2026-04-11T14:26:02Z

+end
+```
+
+The delay formula is `base * 2^(attempt-1) * (1 + random_jitter)`. For a job that fails 4 times with defaults: ~5s, ~10s, ~20s, ~40s before hitting DLQ on the 5th read.


⚠️ Potential issue | 🟡 Minor

Clarify the jitter formula to match the description.

Line 483 describes jitter as "+-15% randomization" (bidirectional), but the formula on line 487 uses (1 + random_jitter), which would produce only positive jitter (1.0 to 1.15 multiplier).

For true ±15% randomization, the formula should be something like:

base * 2^(attempt-1) * uniform(1-jitter, 1+jitter)

This would give a multiplier range of 0.85 to 1.15 when jitter=0.15, matching the "+-15%" description.
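The suggested formula can be sketched in Ruby as follows. This is a minimal illustration, not pgbus's implementation: the method name `backoff_delay`, its keyword arguments, and the defaults (a 5-second base inferred from the "~5s, ~10s, ..." example and a jitter of 0.15 from the "+-15%" description) are assumptions made for this sketch.

```ruby
# Hypothetical helper illustrating symmetric (plus/minus) jitter.
# Names and defaults are assumptions for this sketch, not pgbus internals.
def backoff_delay(attempt, base: 5.0, jitter: 0.15, rng: Random.new)
  # Exponential growth: base, 2*base, 4*base, ... for attempt = 1, 2, 3, ...
  raw = base * (2**(attempt - 1))

  # rng.rand returns a float in [0, 1), so (2 * rand - 1) lies in [-1, 1)
  # and the multiplier lies in [1 - jitter, 1 + jitter).
  multiplier = 1.0 + jitter * (2 * rng.rand - 1)

  raw * multiplier
end
```

With `jitter: 0.15` the first-attempt delay falls between 4.25 and 5.75 seconds, whereas a `(1 + random_jitter)` multiplier with a non-negative random term would never drop below 5 seconds.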

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@README.md` at line 487, The README's backoff formula currently shows "base * 2^(attempt-1) * (1 + random_jitter)" which yields only positive jitter; update the formula to reflect true ±jitter by using a uniform multiplier range, e.g., "base * 2^(attempt-1) * uniform(1 - jitter, 1 + jitter)" (or equivalently "base * 2^(attempt-1) * (1 + random_in_range(-jitter, jitter))") so the described "±15% randomization" matches the documented formula and clarify the variable name "jitter" used in the description.

coderabbitai · 2026-04-11T14:26:02Z

+The delay formula is `base * 2^(attempt-1) * (1 + random_jitter)`. For a job that fails 4 times with defaults: ~5s, ~10s, ~20s, ~40s before hitting DLQ on the 5th read.
+


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify the default max_retries value in configuration.rb
rg -n "max_retries.*=" lib/pgbus/configuration.rb -A 2 -B 2

Repository: mhenrixon/pgbus

Length of output: 896

🏁 Script executed:

#!/bin/bash
# Search for the DLQ routing logic that uses read_ct and max_retries
rg -n "read_ct.*max_retries|max_retries.*read_ct" lib/pgbus/event_bus/ -A 3 -B 3

Repository: mhenrixon/pgbus

Length of output: 41

🏁 Script executed:

#!/bin/bash
# Also search for where messages are moved to DLQ based on retry count
rg -n "dlq|dead.*letter" lib/pgbus/event_bus/ -i -A 2 -B 2

Repository: mhenrixon/pgbus

Length of output: 41

🏁 Script executed:

#!/bin/bash
# List the structure of event_bus directory
find lib/pgbus/event_bus -type f -name "*.rb" | head -20

Repository: mhenrixon/pgbus

Length of output: 188

🏁 Script executed:

#!/bin/bash
# Search for max_retries usage across the codebase
rg -n "max_retries" lib/pgbus/ --type rb -B 2 -A 2

Repository: mhenrixon/pgbus

Length of output: 85

🏁 Script executed:

#!/bin/bash
# Search for max_retries usage across the codebase
rg -n "max_retries" lib/pgbus/ -A 2 -B 2

Repository: mhenrixon/pgbus

Length of output: 3756

🏁 Script executed:

#!/bin/bash
# Search for read_ct which indicates message read count
rg -n "read_ct" lib/pgbus/ -A 2 -B 2

Repository: mhenrixon/pgbus

Length of output: 15281

Update the retry failure example to match the default max_retries configuration.

The README example states "For a job that fails 4 times...before hitting DLQ on the 5th read," but the code checks read_ct > max_retries (where max_retries = 5 by default). This means a message must fail 5 times before being routed to DLQ on the 6th read.

Update the example to: "For a job that fails 5 times with defaults: ~5s, ~10s, ~20s, ~40s, ~80s before hitting DLQ on the 6th read."
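The arithmetic behind this suggestion can be checked with a small Ruby model. It is a sketch under stated assumptions: `retry_schedule` is a hypothetical name, the 5-second base is inferred from the README example, jitter is ignored, and the attempt number is assumed to equal `read_ct`. Only the `read_ct > max_retries` comparison comes from the analysis above.

```ruby
# Hypothetical model of the dead-letter check described in this comment.
# It does not call pgbus; it only replays the comparison read_ct > max_retries.
def retry_schedule(max_retries: 5, base: 5)
  delays = []
  read_ct = 1

  loop do
    # On this read the message exceeds the limit and is routed to the DLQ.
    break if read_ct > max_retries

    # The job fails on this read; record the nominal delay (jitter omitted)
    # before the next read: base * 2^(read_ct - 1).
    delays << base * (2**(read_ct - 1))
    read_ct += 1
  end

  { delays: delays, dlq_on_read: read_ct }
end
```

Calling `retry_schedule` with the assumed defaults gives nominal delays of 5, 10, 20, 40, and 80 seconds and a DLQ routing on read 6. Passing `max_retries: 4` reproduces the sequence in the current README text (5, 10, 20, 40, then DLQ on read 5), which suggests the README example was written as if the limit were 4.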

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 487 - 488, Update the README retry example to reflect the actual default max_retries logic: the code checks read_ct > max_retries with max_retries = 5, so a message must fail 5 times and will hit the DLQ on the 6th read; change the sentence to: "For a job that fails 5 times with defaults: ~5s, ~10s, ~20s, ~40s, ~80s before hitting DLQ on the 6th read." Reference the README line describing the delay formula and the variables max_retries and read_ct when making this edit.

mhenrixon self-assigned this Apr 11, 2026

mhenrixon added the documentation Improvements or additions to documentation label Apr 11, 2026

mhenrixon enabled auto-merge (squash) April 11, 2026 12:10

coderabbitai Bot requested changes Apr 11, 2026

Comment thread README.md Outdated

docs: add language identifier to fenced code block

e977c2b

coderabbitai Bot requested changes Apr 11, 2026

mhenrixon disabled auto-merge April 11, 2026 14:32

mhenrixon merged commit 9407d26 into main Apr 11, 2026
9 checks passed

mhenrixon deleted the docs/readme-catchup branch April 11, 2026 14:32

mhenrixon merged 2 commits into main from docs/readme-catchup
