docs: update README with recent feature documentation #122

Merged
mhenrixon merged 2 commits into main from docs/readme-catchup on Apr 11, 2026

Conversation

@mhenrixon
Owner

@mhenrixon mhenrixon commented Apr 11, 2026

Summary

Catch-up documentation for features merged in PRs 110-121 that were missing from the README.

New sections

  • Observability — new top-level section with:
    • Error reporting: config.error_reporters with APM integration examples (Appsignal, Sentry)
    • Structured logging: config.log_format = :json with component extraction and context
    • Queue health monitoring — dashboard panel, autovacuum tuning generator/rake task, 8 Prometheus metrics
  • Retry backoff — under Reliability, exponential backoff with jitter, per-job overrides via RetryBackoff::JobMixin
  • Stream name helpers — under Real-time broadcasts, Pgbus.stream_key, Streams::Streamable mixin for UUID-heavy apps
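
As a rough illustration of the error reporting hook listed above: the walkthrough later in this thread describes reporters as callables that receive `(exception, context_hash)`. The sketch below uses a plain Struct as a hypothetical stand-in for the gem's configuration object, and the dispatcher lambda and the `queue` context key are assumptions made for this example, not pgbus internals.

```ruby
# Hypothetical stand-in for a configuration object exposing error_reporters.
config = Struct.new(:error_reporters).new([])

# A reporter is any callable taking (exception, context_hash).
# A real one might forward to an APM client; this one records to an array.
reported = []
config.error_reporters << ->(exception, context) {
  reported << { error: exception.class.name, message: exception.message, context: context }
}

# Assumed dispatcher: calls every reporter and keeps a failing reporter
# from masking the original error.
report = lambda do |exception, context|
  config.error_reporters.each do |reporter|
    reporter.call(exception, context)
  rescue StandardError => e
    warn "error reporter failed: #{e.class}: #{e.message}"
  end
end

begin
  raise ArgumentError, "boom"
rescue ArgumentError => e
  report.call(e, { queue: "default" })
end

puts reported.inspect
```

Using plain callables means a lambda, a method object, or any object responding to `call` can be registered without a dedicated adapter class.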

Config reference additions

| Config | PR |
|---|---|
| retry_backoff, retry_backoff_max, retry_backoff_jitter | #110 |
| streams_path | #111 |
| execution_mode | #112 |
| streams_stats_enabled | #117 |
| error_reporters, log_format | #121 |
| metrics_enabled | #117 |

Other updates

  • Features list: retry backoff, error reporting, structured logging, queue health
  • TOC: new Observability section, Retry backoff under Reliability
  • Database tables: pgbus_presence_members, pgbus_stream_stats

Test plan

  • Verify all TOC links resolve to correct anchors
  • Verify config reference entries match actual defaults in configuration.rb
  • Verify code examples are syntactically correct

Summary by CodeRabbit

  • Documentation
    • Added retry backoff guidance describing configurable exponential backoff with jitter and per-job overrides
    • Expanded observability docs: error reporting hooks, structured (JSON/text) logging with context, and optional metrics
    • Documented queue health monitoring and real-time stream stats/monitoring
    • Described stream naming helpers and handling when names exceed queue-length limits
    • Added entries for presence and stream-stats tables and updated configuration reference

Add documentation for features merged in PRs 110-121 that were
missing from the README:

- Error reporting (config.error_reporters) with APM integration examples
- Structured JSON logging (config.log_format) with component extraction
- Queue health monitoring (dashboard panel, autovacuum tuning, Prometheus metrics)
- Retry backoff (exponential with jitter, per-job overrides via JobMixin)
- Stream name helpers (Pgbus.stream_key, Streams::Streamable mixin)
- Missing config reference entries (retry_backoff/max/jitter, streams_path,
  execution_mode, error_reporters, log_format, metrics_enabled,
  streams_stats_enabled)
- Database tables (pgbus_presence_members, pgbus_stream_stats)
- Updated features list and table of contents
@coderabbitai

coderabbitai Bot commented Apr 11, 2026

Walkthrough

README updated to document new reliability and observability features: configurable exponential retry backoff with jitter, pluggable error reporting, structured logging formats, queue health monitoring/metrics, stream name helpers, and additions to features, database tables, and configuration references.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation: README.md | Added "Retry backoff" (global and per-job backoff, jitter, Pgbus::RetryBackoff::JobMixin/pgbus_retry_backoff), "Observability" (error reporters callable receiving (exception, context_hash), structured JSON/text log formats via config.log_format, optional Prometheus metrics config.metrics_enabled), "Stream name helpers" (Pgbus::Streams::Streamable, short_id/to_stream_key, Pgbus.stream_key, Pgbus::Streams::StreamNameTooLong), expanded Features/Database tables/Configuration reference and added pgbus_presence_members and pgbus_stream_stats tables. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

  • feat(retry): exponential backoff for VT-based retries #110: Introduces the retry-backoff implementation and JobMixin/pgbus_retry_backoff DSL documented in this README update.
  • #86: README restructuring into themed sections (Reliability/Observability) that these additions follow or complement.

Poem

🐇 I nibble docs with eager paws,

Backoffs, logs, and metrics—no lost cause.
Streams named tidy, errors shown bright,
Queues hum steady through the night.
Hooray — observability takes flight!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title 'docs: update README with recent feature documentation' is generic and uses non-specific language. While it mentions README updates, it lacks detail about what specific features or improvements are documented, making it difficult to understand the scope of changes from the title alone. | Consider a more specific title that highlights the primary features documented, such as 'docs: add observability, retry backoff, and stream helpers to README' to better convey the significance and scope of the documentation updates. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@mhenrixon mhenrixon self-assigned this Apr 11, 2026
@mhenrixon mhenrixon added the documentation Improvements or additions to documentation label Apr 11, 2026
@mhenrixon mhenrixon enabled auto-merge (squash) April 11, 2026 12:10

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Around line 736-738: The fenced code block showing the log example uses plain
triple backticks without a language; update the opening fence from ``` to
```text so the block containing "INFO 2025-01-15T10:30:00.000Z pid=1234 tid=abc
queue=default: Starting job" has a language identifier for proper highlighting
and to satisfy markdownlint.
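
For readers unfamiliar with the two log shapes under discussion, here is a minimal sketch of how a text line like the one quoted above and a JSON equivalent could be produced with Ruby's standard `Logger`. The formatter procs, the `context` hash, and the JSON field names are assumptions for illustration; they are not taken from the pgbus source.

```ruby
require "json"
require "logger"
require "stringio"
require "time"

# Assumed per-job context; pgbus may attach different fields.
context = { queue: "default" }

# Text shape: "SEVERITY timestamp key=value ...: message"
text_formatter = proc do |severity, time, _progname, msg|
  fields = { pid: Process.pid }.merge(context).map { |k, v| "#{k}=#{v}" }.join(" ")
  "#{severity} #{time.utc.iso8601(3)} #{fields}: #{msg}\n"
end

# JSON shape: one object per line, easy for log aggregators to parse.
json_formatter = proc do |severity, time, _progname, msg|
  payload = { level: severity, time: time.utc.iso8601(3), pid: Process.pid, message: msg }
  JSON.generate(payload.merge(context)) + "\n"
end

text_io = StringIO.new
text_logger = Logger.new(text_io)
text_logger.formatter = text_formatter
text_logger.info("Starting job")

json_io = StringIO.new
json_logger = Logger.new(json_io)
json_logger.formatter = json_formatter
json_logger.info("Starting job")

puts text_io.string
puts json_io.string
```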

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 99c48d2a-a58c-4b1e-b21a-50a8089520a1

📥 Commits

Reviewing files that changed from the base of the PR and between 6c8f847 and 68fc79f.

📒 Files selected for processing (1)
  • README.md

Comment thread README.md Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Line 487: The README's backoff formula currently shows "base * 2^(attempt-1) *
(1 + random_jitter)" which yields only positive jitter; update the formula to
reflect true ±jitter by using a uniform multiplier range, e.g., "base *
2^(attempt-1) * uniform(1 - jitter, 1 + jitter)" (or equivalently "base *
2^(attempt-1) * (1 + random_in_range(-jitter, jitter))") so the described "±15%
randomization" matches the documented formula and clarify the variable name
"jitter" used in the description.
- Around line 487-488: Update the README retry example to reflect the actual
default max_retries logic: the code checks read_ct > max_retries with
max_retries = 5, so a message must fail 5 times and will hit the DLQ on the 6th
read; change the sentence to: "For a job that fails 5 times with defaults: ~5s,
~10s, ~20s, ~40s, ~80s before hitting DLQ on the 6th read." Reference the README
line describing the delay formula and the variables max_retries and read_ct when
making this edit.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2f7d7683-2d98-4800-a7bd-5fdf0966e4cb

📥 Commits

Reviewing files that changed from the base of the PR and between 68fc79f and e977c2b.

📒 Files selected for processing (1)
  • README.md

Comment thread README.md
end
```

The delay formula is `base * 2^(attempt-1) * (1 + random_jitter)`. For a job that fails 4 times with defaults: ~5s, ~10s, ~20s, ~40s before hitting DLQ on the 5th read.

⚠️ Potential issue | 🟡 Minor

Clarify the jitter formula to match the description.

Line 483 describes jitter as "+-15% randomization" (bidirectional), but the formula on line 487 uses (1 + random_jitter), which would produce only positive jitter (1.0 to 1.15 multiplier).

For true ±15% randomization, the formula should be something like:

base * 2^(attempt-1) * uniform(1-jitter, 1+jitter)

This would give a multiplier range of 0.85 to 1.15 when jitter=0.15, matching the "+-15%" description.
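
To make the difference between the two formulas concrete, here is a small sketch of the symmetric variant suggested above. It is not the pgbus implementation: the method name, the 5 second base (inferred from the "~5s" in the README example), and the 0.15 jitter are assumptions, and the `retry_backoff_max` cap mentioned in the config reference is left out for brevity.

```ruby
# Symmetric jitter sketch: delay = base * 2^(attempt-1) * uniform(1-jitter, 1+jitter).
# All parameter names and defaults here are illustrative assumptions.
def backoff_delay(attempt, base: 5.0, jitter: 0.15, rng: Random.new)
  nominal = base * (2**(attempt - 1))
  # Random#rand with a Float range draws uniformly from that interval,
  # so the multiplier falls on both sides of 1.0.
  multiplier = rng.rand((1.0 - jitter)..(1.0 + jitter))
  nominal * multiplier
end

(1..5).each do |attempt|
  puts "attempt #{attempt}: #{backoff_delay(attempt).round(2)}s"
end
```

With `jitter: 0.15`, attempt 3 lands between 17s and 23s around the nominal 20s, whereas a `(1 + random_jitter)` multiplier could only produce values from 20s upward.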


Comment thread README.md
Comment on lines +487 to +488
The delay formula is `base * 2^(attempt-1) * (1 + random_jitter)`. For a job that fails 4 times with defaults: ~5s, ~10s, ~20s, ~40s before hitting DLQ on the 5th read.


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify the default max_retries value in configuration.rb

rg -n "max_retries.*=" lib/pgbus/configuration.rb -A 2 -B 2

Repository: mhenrixon/pgbus

Length of output: 896


🏁 Script executed:

#!/bin/bash
# Search for the DLQ routing logic that uses read_ct and max_retries
rg -n "read_ct.*max_retries|max_retries.*read_ct" lib/pgbus/event_bus/ -A 3 -B 3

Repository: mhenrixon/pgbus

Length of output: 41


🏁 Script executed:

#!/bin/bash
# Also search for where messages are moved to DLQ based on retry count
rg -n "dlq|dead.*letter" lib/pgbus/event_bus/ -i -A 2 -B 2

Repository: mhenrixon/pgbus

Length of output: 41


🏁 Script executed:

#!/bin/bash
# List the structure of event_bus directory
find lib/pgbus/event_bus -type f -name "*.rb" | head -20

Repository: mhenrixon/pgbus

Length of output: 188


🏁 Script executed:

#!/bin/bash
# Search for max_retries usage across the codebase
rg -n "max_retries" lib/pgbus/ --type rb -B 2 -A 2

Repository: mhenrixon/pgbus

Length of output: 85


🏁 Script executed:

#!/bin/bash
# Search for max_retries usage across the codebase
rg -n "max_retries" lib/pgbus/ -A 2 -B 2

Repository: mhenrixon/pgbus

Length of output: 3756


🏁 Script executed:

#!/bin/bash
# Search for read_ct which indicates message read count
rg -n "read_ct" lib/pgbus/ -A 2 -B 2

Repository: mhenrixon/pgbus

Length of output: 15281


Update the retry failure example to match the default max_retries configuration.

The README example states "For a job that fails 4 times...before hitting DLQ on the 5th read," but the code checks read_ct > max_retries (where max_retries = 5 by default). This means a message must fail 5 times before being routed to DLQ on the 6th read.

Update the example to: "For a job that fails 5 times with defaults: ~5s, ~10s, ~20s, ~40s, ~80s before hitting DLQ on the 6th read."
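
The arithmetic behind the corrected example can be checked with a short sketch. It assumes what the comment above states (a `read_ct > max_retries` check with `max_retries = 5`) plus a 5 second base delay, and it ignores jitter so the nominal values are visible. The method names are hypothetical, not pgbus APIs.

```ruby
# A message goes to the dead letter queue once its read count exceeds max_retries.
def dead_letter?(read_ct, max_retries: 5)
  read_ct > max_retries
end

# Nominal (jitter-free) delay before the next read: base * 2^(attempt-1).
def nominal_delay(attempt, base: 5)
  base * (2**(attempt - 1))
end

(1..6).each do |read_ct|
  if dead_letter?(read_ct)
    puts "read #{read_ct}: routed to DLQ"
  else
    puts "read #{read_ct}: fails, retry in ~#{nominal_delay(read_ct)}s"
  end
end
# Reads 1 through 5 print delays of 5, 10, 20, 40 and 80 seconds;
# read 6 prints "routed to DLQ".
```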


@mhenrixon mhenrixon disabled auto-merge April 11, 2026 14:32
@mhenrixon mhenrixon merged commit 9407d26 into main Apr 11, 2026
9 checks passed
@mhenrixon mhenrixon deleted the docs/readme-catchup branch April 11, 2026 14:32