
perf(compression): cache pre-compiled Dictionary across block decompress calls#227

Merged
polaz merged 4 commits into main from feat/#217-perfcompression-cache-pre-compiled-dictionary-acro on Apr 7, 2026

Conversation

@polaz
Member

@polaz polaz commented Apr 6, 2026

Summary

  • C FFI backend: DecoderDictionary<'static> (wraps ZSTD_DDict) is now cached in ZstdDictionary via Arc<OnceLock<...>> — parsed once per process, shared across all clones of the same dictionary handle, zero re-parsing on subsequent blocks
  • Pure Rust backend: FrameDecoder with dictionary pre-loaded is cached in thread-local storage keyed by dict_id — parsed once per thread, no mutex needed (FrameDecoder is !Send)
  • Correctness fix: latent bug in pure Rust decompress_with_dict — it called init(data) on a Copy slice, which only reads the frame header, so the decode buffer stayed empty and every call returned Ok([]); replaced with decode_all_to_vec(&mut input), which fully decodes the frame

Changes

  • src/compression/mod.rs: Add prepared: Arc<OnceLock<DecoderDictionary<'static>>> to ZstdDictionary; add decoder_dict() accessor; change decompress_with_dict signature to take &ZstdDictionary
  • src/compression/zstd_ffi.rs: Use Decompressor::with_prepared_dictionary(dict.decoder_dict()) — no more per-call ZSTD_createDDict
  • src/compression/zstd_pure.rs: TLS-cached FrameDecoder; fix correctness bug; add unit tests with pre-generated test vectors
  • src/table/block/mod.rs: Update 4 decompress_with_dict call sites to pass &dict instead of dict.raw()
  • benches/zstd_dict.rs: New warm/cold per-block latency benchmarks

Test Plan

  • cargo clippy --features zstd --all-targets -- -D warnings — clean
  • cargo clippy --features zstd-pure --all-targets -- -D warnings — clean
  • cargo nextest run --features zstd --workspace — 1168/1168 passed
  • cargo nextest run --features zstd-pure --workspace — 1157/1157 passed
  • cargo test --doc --workspace — 41/41 passed
  • cargo build --bench zstd_dict --features zstd — compiles
  • cargo build --bench zstd_dict --features zstd-pure — compiles

Closes #217

Summary by CodeRabbit

  • Tests

    • Added a benchmark to measure decompression performance using zstd dictionaries.
  • Refactor

    • Improved compression API to use dictionary objects and enable internal dictionary caching for better decompression efficiency.
    • Compression module is now hidden from generated public documentation.

perf(compression): cache pre-compiled Dictionary across block decompress calls

- C FFI backend: cache `DecoderDictionary<'static>` (ZSTD_DDict) in
  `ZstdDictionary` via `Arc<OnceLock<...>>` — parsed once per process,
  shared across all clones of the same dictionary handle
- Pure Rust backend: cache `FrameDecoder` with dictionary pre-loaded in
  thread-local storage keyed by dict ID — parsed once per thread
- Fix latent correctness bug in pure Rust `decompress_with_dict`: was
  calling `init(data)` on a Copy slice (reads frame header only, output
  buffer stays empty); replace with `decode_all_to_vec` which takes
  `&mut input` and fully decodes the frame
- Change `CompressionProvider::decompress_with_dict` signature from
  `dict_raw: &[u8]` to `dict: &ZstdDictionary` to give backends access
  to the cached prepared form; update all four call sites in block/mod.rs
- Add `ZstdDictionary::decoder_dict()` — lazily initialises ZSTD_DDict
  via `OnceLock::get_or_init` (C FFI only)
- Add unit tests for pure Rust backend with pre-generated test vectors
  (decompress + idempotent repeated calls exercising TLS cache path)
- Add `benches/zstd_dict.rs` with warm / cold per-block latency benchmarks
- Expose `#[doc(hidden)] pub mod compression` so benchmarks can reach
  `CompressionProvider` and `ZstdBackend` type alias

Closes #217
Copilot AI review requested due to automatic review settings April 6, 2026 21:47
@coderabbitai

coderabbitai Bot commented Apr 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: dadc0493-7596-44bb-b041-055e41a7e755

📥 Commits

Reviewing files that changed from the base of the PR and between e0d4113 and 76fb725.

📒 Files selected for processing (7)
  • Cargo.toml
  • benches/zstd_dict.rs
  • src/compression/mod.rs
  • src/compression/zstd_ffi.rs
  • src/compression/zstd_pure.rs
  • src/lib.rs
  • src/table/block/mod.rs

📝 Walkthrough

Walkthrough

Refactors zstd dictionary handling to pass a ZstdDictionary through the compression API, add a lazily-initialized prepared dictionary cache, update provider implementations and call sites to use the cached prepared dictionary, and add a Criterion benchmark measuring warm vs cold dictionary decompression.

Changes

  • Benchmark Infrastructure (Cargo.toml, benches/zstd_dict.rs): Add a new zstd_dict Criterion benchmark measuring warm (cached) and cold (fresh) dictionary decompressions.
  • Compression Trait & Dictionary Type (src/compression/mod.rs): Change CompressionProvider::decompress_with_dict to accept &ZstdDictionary; widen the ZstdDictionary id from u32 to u64; add a feature-gated prepared: Arc<OnceLock<...>> field, a decoder_dict() accessor, and a manual Clone; adjust Debug formatting.
  • Zstd Provider Implementations (src/compression/zstd_ffi.rs, src/compression/zstd_pure.rs): Providers now accept &ZstdDictionary and use prepared/cached decoder dictionaries; zstd_pure adds a thread-local FrameDecoder cache keyed by dict id and new error handling for oversized output; tests added for the pure implementation.
  • Block Decompression Call Sites (src/table/block/mod.rs): All CompressionType::ZstdDict call sites updated to pass ZstdDictionary references instead of raw bytes in both encrypted and unencrypted paths.
  • Module Visibility (src/lib.rs): Mark the compression module #[doc(hidden)] while keeping it public.

Sequence Diagram(s)

sequenceDiagram
    participant BlockReader as Block Reader
    participant ZstdDict as ZstdDictionary
    participant Cache as Prepared Dict Cache
    participant Provider as Compression Provider

    BlockReader->>ZstdDict: request decoder_dict()
    ZstdDict->>Cache: get_or_init()
    alt cache miss
        Cache->>Cache: prepare DecoderDictionary
        Cache-->>ZstdDict: return prepared dict (cached)
    else cache hit
        Cache-->>ZstdDict: return prepared dict
    end
    BlockReader->>Provider: decompress_with_dict(data, &ZstdDict)
    Provider->>ZstdDict: decoder_dict() (use cached prepared dict)
    Provider-->>BlockReader: decompressed data

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I nibble at dictionaries prepared and neat,

OnceLock keeps them warm, no repeat,
Blocks unwind their zstd thread,
Fast hops forward — no more dread,
A cached crunch, my carrot treat!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check (✅ Passed): The PR title clearly summarizes the main change: caching pre-compiled Zstandard dictionaries to improve decompression performance by avoiding repeated parsing.
  • Linked Issues Check (✅ Passed): The PR fully addresses issue #217 objectives: extends decompress_with_dict to use stateful ZstdDictionary handles, caches prepared dictionaries avoiding per-call decode_dict() overhead, and includes benchmarks measuring per-block latency improvements.
  • Out of Scope Changes Check (✅ Passed): All changes directly support the core objective of caching pre-compiled dictionaries. The hidden documentation module change is a minor organizational improvement. No unrelated changes detected.
  • Docstring Coverage (✅ Passed): Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.


@codecov

codecov Bot commented Apr 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


Copilot AI left a comment


Pull request overview

This PR improves zstd dictionary decompression performance by caching prepared dictionary state across block decompress calls (FFI backend via OnceLock, pure-Rust backend via TLS), and fixes a correctness bug in the pure-Rust dictionary decompression path. It also adds a Criterion benchmark to measure warm vs cold dictionary decompress latency.

Changes:

  • Cache pre-compiled zstd dictionaries: ZSTD_DDict (FFI) and a TLS FrameDecoder (pure Rust).
  • Update zstd dictionary decompression API to take &ZstdDictionary and adjust call sites.
  • Add a new zstd_dict benchmark for per-block dict decompression latency.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Summary per file:
  • src/compression/mod.rs: Extends ZstdDictionary with a lazily initialized prepared dictionary cache (FFI) and updates decompress_with_dict to take &ZstdDictionary.
  • src/compression/zstd_ffi.rs: Switches to with_prepared_dictionary(dict.decoder_dict()) to avoid per-call ZSTD_createDDict.
  • src/compression/zstd_pure.rs: Adds TLS caching for dict decompression and changes the decode path to fully decode frames.
  • src/table/block/mod.rs: Updates zstd dict decompression call sites to pass &ZstdDictionary instead of raw bytes.
  • src/lib.rs: Makes the compression module public (hidden from docs) to support benchmark access.
  • Cargo.toml: Registers the new zstd_dict benchmark target.
  • benches/zstd_dict.rs: Adds warm/cold Criterion benchmark for dictionary decompression latency.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benches/zstd_dict.rs`:
- Around line 18-20: The unconditional import of
lsm_tree::compression::ZstdDictionary causes compilation failures when no zstd
backend is enabled; wrap the import with the same cfg used by the backend (e.g.
add #[cfg(zstd_any)] above the use of ZstdDictionary) and ensure any code that
references ZstdDictionary (the benchmark setup and the zstd-specific branch that
currently falls back to a no-op) is also gated behind #[cfg(zstd_any)] so the
file compiles when the feature is absent.

In `@src/compression/zstd_pure.rs`:
- Around line 147-149: Replace the #[allow(...)] Clippy suppressions on the test
module with #[expect(..., reason = "...")] attributes: remove
#[allow(clippy::unwrap_used, clippy::expect_used, reason = "...")] on the tests
mod and add #[expect(clippy::unwrap_used, reason = "...")] and
#[expect(clippy::expect_used, reason = "...")] (one per lint) above mod tests so
the new test code uses Clippy expect annotations compatible with MSRV 1.92.
- Around line 105-122: The TLS reuse currently keys cached decoder by dict.id()
(the 32-bit truncated fingerprint), which can collide; change the cache key in
TLS_DECODER to a collision-resistant identifier (e.g., store and compare the
full dictionary bytes or a full-width hash) and reinitialize when that
identifier differs: when building the decoder in the TLS_DECODER closure (where
state is an Option<(u32, FrameDecoder)>), replace the 32-bit id with a safe key
derived from dict.raw() (or store dict.raw().to_vec() alongside the
FrameDecoder) and compare that key instead of dict.id() before deciding to reuse
the FrameDecoder created via Dictionary::decode_dict and FrameDecoder::add_dict.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ae6953a0-786f-4b6d-b182-2bbb82736130

📥 Commits

Reviewing files that changed from the base of the PR and between e0d4113 and 4ee8981.

📒 Files selected for processing (7)
  • Cargo.toml
  • benches/zstd_dict.rs
  • src/compression/mod.rs
  • src/compression/zstd_ffi.rs
  • src/compression/zstd_pure.rs
  • src/lib.rs
  • src/table/block/mod.rs

…re Rust dict decompress

- Change TLS decoder cache key from truncated u32 to full u64 xxh3
  fingerprint; eliminates cross-dict aliasing when two distinct
  dictionaries share the same lower 32 bits
- Return DecompressedSizeTooLarge when decode_all_to_vec output exceeds
  capacity, matching the bounded behaviour of decompress() and the C backend
- Add regression test: decompress_with_dict_rejects_frame_exceeding_capacity
- Replace #[allow(clippy::...)] with two separate #[expect(..., reason)]
  attributes on the test module (MSRV 1.92 standard)
- Gate bench constants and imports behind #[cfg(zstd_any)] so the file
  compiles with default features (no zstd backend enabled)
- Document that cold bench measures TLS-hit path for pure Rust backend
  (same dict hash persists in TLS across iterations in the same thread)
@polaz polaz requested a review from Copilot April 6, 2026 23:37
@polaz
Member Author

polaz commented Apr 6, 2026

@coderabbitai full review


Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

… guard

The FrameDecoder::init + bounded_read approach does not work: FrameDecoder
processes the full frame at once and its Read impl returns 0 bytes after init
unless driven by decode_all_to_vec. StreamingDecoder supports streaming reads;
FrameDecoder does not.

Restore decode_all_to_vec with an explicit post-decode capacity check:
if output.len() > capacity return DecompressedSizeTooLarge, matching the
bounded behaviour of decompress() and the C FFI backend. Add detailed
comment explaining why bounded_read cannot be used for dict decompression.

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

…semantics

- Update ZstdDictionary::new doc: id stored as full 64-bit hash,
  id() truncates to u32 at call time (not at construction)
- Tighten prepared field comment: ZSTD_DDict is cached per handle
  (not globally per unique bytes) via Arc<OnceLock<...>>
- Strengthen decompress_with_dict_rejects_frame_exceeding_capacity:
  assert DecompressedSizeTooLarge variant specifically instead of
  is_err(); normalize FrameDecoderError::TargetTooSmall to
  DecompressedSizeTooLarge for a consistent public error API

Addresses Copilot review threads #10, #11, #12.

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.

@polaz
Member Author

polaz commented Apr 7, 2026

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 7, 2026

✅ Actions performed

Full review triggered.

polaz pushed a commit that referenced this pull request Apr 9, 2026
## 🤖 New release

* `coordinode-lsm-tree`: 4.3.1 -> 4.4.0

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [4.4.0](v4.3.1...v4.4.0) - 2026-04-09

### Added

- *(compression)* enable dictionary compression in pure Rust backend
([#229](#229))

### Performance

- *(compression)* cache pre-compiled Dictionary across block decompress
calls
([#227](#227))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: sw-release-bot[bot] <255865126+sw-release-bot[bot]@users.noreply.github.com>
