docs: update README with recent feature documentation #122
Add documentation for features merged in PRs 110-121 that were missing from the README:
- Error reporting (`config.error_reporters`) with APM integration examples
- Structured JSON logging (`config.log_format`) with component extraction
- Queue health monitoring (dashboard panel, autovacuum tuning, Prometheus metrics)
- Retry backoff (exponential with jitter, per-job overrides via `JobMixin`)
- Stream name helpers (`Pgbus.stream_key`, `Streams::Streamable` mixin)
- Missing config reference entries (`retry_backoff`/`max`/`jitter`, `streams_path`, `execution_mode`, `error_reporters`, `log_format`, `metrics_enabled`, `streams_stats_enabled`)
- Database tables (`pgbus_presence_members`, `pgbus_stream_stats`)
- Updated features list and table of contents
📝 Walkthrough
README updated to document new reliability and observability features: configurable exponential retry backoff with jitter, pluggable error reporting, structured logging formats, queue health monitoring/metrics, stream name helpers, and additions to features, database tables, and configuration references.
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~2 minutes
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md`:
- Around line 736-738: The fenced code block showing the log example uses plain
triple backticks without a language; update the opening fence from ``` to
```text so the block containing "INFO 2025-01-15T10:30:00.000Z pid=1234 tid=abc
queue=default: Starting job" has a language identifier for proper highlighting
and to satisfy markdownlint.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 99c48d2a-a58c-4b1e-b21a-50a8089520a1
📒 Files selected for processing (1)
README.md
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md`:
- Line 487: The README's backoff formula currently shows "base * 2^(attempt-1) *
(1 + random_jitter)" which yields only positive jitter; update the formula to
reflect true ±jitter by using a uniform multiplier range, e.g., "base *
2^(attempt-1) * uniform(1 - jitter, 1 + jitter)" (or equivalently "base *
2^(attempt-1) * (1 + random_in_range(-jitter, jitter))") so the described "±15%
randomization" matches the documented formula and clarify the variable name
"jitter" used in the description.
- Around line 487-488: Update the README retry example to reflect the actual
default max_retries logic: the code checks read_ct > max_retries with
max_retries = 5, so a message must fail 5 times and will hit the DLQ on the 6th
read; change the sentence to: "For a job that fails 5 times with defaults: ~5s,
~10s, ~20s, ~40s, ~80s before hitting DLQ on the 6th read." Reference the README
line describing the delay formula and the variables max_retries and read_ct when
making this edit.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2f7d7683-2d98-4800-a7bd-5fdf0966e4cb
📒 Files selected for processing (1)
README.md
The delay formula is `base * 2^(attempt-1) * (1 + random_jitter)`. For a job that fails 4 times with defaults: ~5s, ~10s, ~20s, ~40s before hitting DLQ on the 5th read.
Clarify the jitter formula to match the description.
Line 483 describes jitter as "+-15% randomization" (bidirectional), but the formula on line 487 uses (1 + random_jitter), which would produce only positive jitter (1.0 to 1.15 multiplier).
For true ±15% randomization, the formula should be something like:
base * 2^(attempt-1) * uniform(1-jitter, 1+jitter)
This would give a multiplier range of 0.85 to 1.15 when jitter=0.15, matching the "+-15%" description.
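The suggested formula can be sketched in Ruby as follows. This is a minimal illustration, not pgbus's actual implementation: the method name `backoff_delay` and its keyword arguments are hypothetical, and the defaults (`base` of 5 seconds, `jitter` of 0.15) are taken from the figures quoted in this thread.

```ruby
# Hypothetical helper illustrating the proposed formula:
#   base * 2^(attempt-1) * uniform(1 - jitter, 1 + jitter)
# base:   delay before the first retry, in seconds (assumed default: 5)
# jitter: fraction of randomization in each direction (0.15 means ±15%)
# rng:    injectable random source so results can be reproduced in tests
def backoff_delay(attempt, base: 5.0, jitter: 0.15, rng: Random.new)
  raise ArgumentError, "attempt must be >= 1" if attempt < 1

  exponential = base * (2**(attempt - 1))
  # Uniform multiplier drawn from [1 - jitter, 1 + jitter], so the delay
  # can land below as well as above the exponential value.
  multiplier = rng.rand((1.0 - jitter)..(1.0 + jitter))
  exponential * multiplier
end

# Print one sample delay for each of the first five attempts.
(1..5).each do |attempt|
  puts format("attempt %d: %.1fs", attempt, backoff_delay(attempt))
end
```

With `(1 + random_jitter)` and a non-negative `random_jitter`, the multiplier could never drop below 1.0; drawing it from a symmetric range is what makes the randomization bidirectional.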
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` at line 487, The README's backoff formula currently shows "base *
2^(attempt-1) * (1 + random_jitter)" which yields only positive jitter; update
the formula to reflect true ±jitter by using a uniform multiplier range, e.g.,
"base * 2^(attempt-1) * uniform(1 - jitter, 1 + jitter)" (or equivalently "base
* 2^(attempt-1) * (1 + random_in_range(-jitter, jitter))") so the described
"±15% randomization" matches the documented formula and clarify the variable
name "jitter" used in the description.
The delay formula is `base * 2^(attempt-1) * (1 + random_jitter)`. For a job that fails 4 times with defaults: ~5s, ~10s, ~20s, ~40s before hitting DLQ on the 5th read.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Verify the default max_retries value in configuration.rb
rg -n "max_retries.*=" lib/pgbus/configuration.rb -A 2 -B 2
Repository: mhenrixon/pgbus
Length of output: 896
🏁 Script executed:
#!/bin/bash
# Search for the DLQ routing logic that uses read_ct and max_retries
rg -n "read_ct.*max_retries|max_retries.*read_ct" lib/pgbus/event_bus/ -A 3 -B 3
Repository: mhenrixon/pgbus
Length of output: 41
🏁 Script executed:
#!/bin/bash
# Also search for where messages are moved to DLQ based on retry count
rg -n "dlq|dead.*letter" lib/pgbus/event_bus/ -i -A 2 -B 2
Repository: mhenrixon/pgbus
Length of output: 41
🏁 Script executed:
#!/bin/bash
# List the structure of event_bus directory
find lib/pgbus/event_bus -type f -name "*.rb" | head -20
Repository: mhenrixon/pgbus
Length of output: 188
🏁 Script executed:
#!/bin/bash
# Search for max_retries usage across the codebase
rg -n "max_retries" lib/pgbus/ --type rb -B 2 -A 2
Repository: mhenrixon/pgbus
Length of output: 85
🏁 Script executed:
#!/bin/bash
# Search for max_retries usage across the codebase
rg -n "max_retries" lib/pgbus/ -A 2 -B 2
Repository: mhenrixon/pgbus
Length of output: 3756
🏁 Script executed:
#!/bin/bash
# Search for read_ct which indicates message read count
rg -n "read_ct" lib/pgbus/ -A 2 -B 2
Repository: mhenrixon/pgbus
Length of output: 15281
Update the retry failure example to match the default max_retries configuration.
The README example states "For a job that fails 4 times...before hitting DLQ on the 5th read," but the code checks read_ct > max_retries (where max_retries = 5 by default). This means a message must fail 5 times before being routed to DLQ on the 6th read.
Update the example to: "For a job that fails 5 times with defaults: ~5s, ~10s, ~20s, ~40s, ~80s before hitting DLQ on the 6th read."
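The counting argument can be checked with a short simulation. This is only a sketch under the assumptions stated in this comment (a `read_ct > max_retries` check, `max_retries = 5`, a 5-second base delay); the method name `simulate_always_failing_job` is hypothetical and is not part of pgbus, and jitter is left out so the delays are exact.

```ruby
# Hypothetical simulation of a job that fails on every read.
# Assumes the dead-letter check described above: read_ct > max_retries.
def simulate_always_failing_job(max_retries: 5, base: 5)
  delays = []
  read_ct = 0
  loop do
    read_ct += 1 # each read of the message increments its read count
    # Route to the DLQ once the read count exceeds max_retries.
    return { dlq_on_read: read_ct, delays: delays } if read_ct > max_retries

    # The job runs and fails; record the delay before the next attempt
    # (exponential part only, jitter omitted).
    delays << base * (2**(read_ct - 1))
  end
end

result = simulate_always_failing_job
puts result[:delays].inspect # [5, 10, 20, 40, 80]
puts result[:dlq_on_read]    # 6
```

Under the same assumptions, calling it with `max_retries: 4` reproduces the figures in the current README sentence (four delays, DLQ on the 5th read), so the wording only holds if the default were 4.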
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` around lines 487 - 488, Update the README retry example to reflect
the actual default max_retries logic: the code checks read_ct > max_retries with
max_retries = 5, so a message must fail 5 times and will hit the DLQ on the 6th
read; change the sentence to: "For a job that fails 5 times with defaults: ~5s,
~10s, ~20s, ~40s, ~80s before hitting DLQ on the 6th read." Reference the README
line describing the delay formula and the variables max_retries and read_ct when
making this edit.
Summary
Catch-up documentation for features merged in PRs 110-121 that were missing from the README.
New sections
- `config.error_reporters` with APM integration examples (Appsignal, Sentry)
- `config.log_format = :json` with component extraction and context
- `RetryBackoff::JobMixin`
- `Pgbus.stream_key`, `Streams::Streamable` mixin for UUID-heavy apps

Config reference additions
- `retry_backoff`, `retry_backoff_max`, `retry_backoff_jitter`
- `streams_path`
- `execution_mode`
- `streams_stats_enabled`
- `error_reporters`, `log_format`
- `metrics_enabled`

Other updates
- `pgbus_presence_members`, `pgbus_stream_stats`

Test plan
- `configuration.rb`

Summary by CodeRabbit