
feat(compression): enable dictionary compression in pure Rust backend#229

Merged
polaz merged 31 commits into
mainfrom
feat/#218-featcompression-enable-dictionary-compression-in-p
Apr 9, 2026

Conversation

@polaz
Member

@polaz polaz commented Apr 7, 2026

Summary

  • Implements `compress_with_dict()` in `ZstdPureProvider` using `FrameCompressor` from `structured-zstd` v0.0.11 (feat: dictionary builder — FastCOVER algorithm and dictionary finalization structured-zstd#25)
  • Removes the C FFI `zstd` crate entirely; `structured-zstd` is now the sole backend under the `zstd` feature flag
  • `zstd-pure` becomes a deprecated alias (`zstd-pure = ["zstd"]`) — enabling it is equivalent to enabling `zstd`
  • Supports both finalized zstd dictionaries (magic bytes `37 A4 30 EC` + entropy tables) and raw content dictionaries
  • TLS caching: single-entry `FrameCompressor` / `FrameDecoder` per thread, keyed by 64-bit xxh3 fingerprint + level
  • `strip_dict_id`: rewritten in-place (`get_mut` + `copy_within` + `truncate`) — eliminates O(frame_len) allocation per compressed block
  • Bug fix: `decode_raw_content_bounded` with `capacity=0` and an empty frame no longer incorrectly returns `DecompressedSizeTooLarge`; uses `remaining.max(1)` in `UptoBytes` so the decoder advances past the empty Last_Block before capacity is checked (see the sketch after this list)
  • CI: `test-zstd-pure` → `test-zstd`, matrix extended to include MSRV 1.92.0
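The empty-frame guard in the bug-fix bullet above boils down to how the per-iteration byte budget for `UptoBytes` is computed. A minimal sketch of that calculation, with an illustrative helper name rather than the crate's actual code:

```rust
/// Per-iteration budget handed to the decoder's `UptoBytes` strategy
/// (illustrative helper; the real code inlines this in the decode loop).
///
/// Clamping to at least 1 byte lets the decoder consume an empty `Last_Block`
/// even when `capacity == 0`; the hard capacity check still runs after each
/// decode step, so oversized frames are rejected exactly as before.
fn decode_budget(capacity: usize, produced_so_far: usize) -> usize {
    capacity.saturating_sub(produced_so_far).max(1)
}
```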

Technical Details

Format detection: `compress_with_dict` and `decompress_with_dict` check for the zstd dictionary magic prefix (bytes `37 A4 30 EC` on the wire, the little-endian encoding of `0xEC30A437`):

  • Finalized dict → `Dictionary::decode_dict` (entropy tables + content)
  • Raw content dict → `Dictionary::from_raw_content` with ID = lower 32 bits of xxh3, clamped to ≥1 (id=0 is reserved in the zstd frame format)
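A hedged sketch of the detection and synthetic-ID derivation described above; the helper names are illustrative and `xxhash_rust` stands in for whichever xxh3 implementation the crate actually uses:

```rust
/// Byte prefix of a finalized zstd dictionary
/// (the little-endian encoding of 0xEC30A437).
const DICT_MAGIC: [u8; 4] = [0x37, 0xA4, 0x30, 0xEC];

/// True if `dict_raw` is a finalized dictionary (entropy tables + content);
/// anything else is treated as a raw content dictionary.
fn is_finalized_dict(dict_raw: &[u8]) -> bool {
    dict_raw.starts_with(&DICT_MAGIC)
}

/// Synthetic ID for a raw content dictionary: lower 32 bits of xxh3,
/// clamped to >= 1 because dictID 0 is reserved in the zstd frame format.
fn synthetic_dict_id(dict_raw: &[u8]) -> u32 {
    let id = xxhash_rust::xxh3::xxh3_64(dict_raw) as u32;
    id.max(1)
}
```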

Raw-content dict ID stripping: `compress_with_dict` strips the synthetic dictID from the frame header after compression. This matches the zstd standard convention where `dictID=0` (absent) means "raw-content dict, id unknown, accept any", preventing decompressors from requiring the specific synthetic id.
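A minimal in-place sketch of that stripping step, assuming the standard zstd frame-header layout (RFC 8878); this is not the crate's actual `strip_dict_id`, just the `copy_within` + `truncate` idea with edge cases simplified:

```rust
/// Remove the Dictionary_ID field from a zstd frame header in place (sketch).
/// Frame layout: 4-byte magic, 1-byte Frame_Header_Descriptor,
/// optional Window_Descriptor, optional Dictionary_ID (1/2/4 bytes), optional FCS.
fn strip_dict_id(frame: &mut Vec<u8>) {
    let Some(&fhd) = frame.get(4) else { return };
    let did_size = match fhd & 0b11 {
        1 => 1,
        2 => 2,
        3 => 4,
        _ => return, // Dictionary_ID_Flag == 0: nothing to strip
    };
    let single_segment = fhd & 0b0010_0000 != 0;
    // Window_Descriptor (1 byte) is present unless the Single_Segment flag is set.
    let did_start = 5 + usize::from(!single_segment);
    let did_end = did_start + did_size;
    if frame.len() < did_end {
        return;
    }
    // Clear the Dictionary_ID_Flag bits so the header no longer advertises an ID.
    frame[4] = fhd & !0b11;
    // Shift the rest of the frame left over the ID field, then drop the tail.
    frame.copy_within(did_end.., did_start);
    frame.truncate(frame.len() - did_size);
}
```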

Decompression-bomb guard: The raw-content dict decompress path calls `decoder.content_size()` after `init()` and rejects frames whose declared size exceeds the caller's capacity limit before allocating the output buffer. Frames without the FCS field fall back to the post-decode check.
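A sketch of that pre-allocation check, assuming the decoder exposes the declared Frame_Content_Size after `init()` as described above; the error type and helper name here are illustrative:

```rust
#[derive(Debug)]
enum SizeError {
    DecompressedSizeTooLarge { declared: u64, capacity: usize },
}

/// Reject a frame whose declared content size exceeds the caller's capacity
/// limit before any output buffer is allocated. Frames without an FCS field
/// (`declared == None`) fall through to the post-decode capacity check.
fn check_declared_size(declared: Option<u64>, capacity: usize) -> Result<(), SizeError> {
    match declared {
        Some(n) if n > capacity as u64 => {
            Err(SizeError::DecompressedSizeTooLarge { declared: n, capacity })
        }
        _ => Ok(()),
    }
}
```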

`ZstdDictionary::id()`: Returns the raw lower 32 bits of xxh3 (may theoretically be 0). Config validation paths compare `dict.id()` against on-disk `dict_id` — both sides derive the same value, so validation is unaffected. The `.max(1)` clamp is only applied inside the backend when embedding an id in a zstd frame header.

Blocker resolved: structured-world/structured-zstd#25 (FastCOVER + dictionary finalization) merged; `structured-zstd` bumped to v0.0.11 (encoding performance improvements: row-based match finder, HC positions rebase, streaming scratch buffer reuse, FSE decoder packing, and HC table improvements).

Known Limitations

Test Plan

  • `src/compression/zstd_pure.rs` — roundtrip, magic detection, all levels, empty input, raw content dict, capacity guard, in-place strip_dict_id, empty-frame-at-capacity-0 regression
  • `tests/zstd_dict_roundtrip.rs` — full Tree write/flush/read, range scan, reopen, missing/wrong dict errors, finalized dict, encryption, per-level policy
  • `tests/zstd_dict_roundtrip.rs` — compaction path: 3 L0 SSTs flushed → `major_compact` → 300 keys readable + range scan (exercises both `compress_with_dict` and `decompress_with_dict` on the compaction hot path)
  • `tests/zstd_dict_roundtrip.rs` — reopen with wrong dict fails at recovery (`ZstdDictMismatch` on first SST read)
  • `cargo nextest run --no-default-features --features zstd,lz4`: 1208 passed
  • `cargo nextest run --all-features`: 1256 passed
  • `cargo clippy --all-features -- -D warnings`: clean
  • `cargo clippy --no-default-features --features zstd,lz4 --all-targets -- -D warnings`: clean
  • CI job `test-zstd` (renamed from `test-zstd-pure`) runs on `[stable, "1.92.0"]`

Closes #218

Summary by CodeRabbit

  • New Features

    • Added support for zstd dictionary compression and decompression (pure-Rust backend).
  • Documentation

    • Clarifies that zstd now uses a pure‑Rust backend (no C toolchain), documents performance tradeoffs and current limitations, and marks the old alias as deprecated.
  • Tests

    • Substantially expanded unit and integration tests for dictionary behaviors and round‑trip/compaction scenarios.
  • Chores

    • CI and benchmark workflow updates; dependency and build-script feature wiring adjusted.

- Implement `compress_with_dict()` in `ZstdPureProvider` using
  `FrameCompressor::set_dictionary()` from `structured-zstd` v0.0.7,
  which added dictionary encoder support in structured-world/structured-zstd#25
- Support both finalized zstd dictionaries (magic 0x37A430EC + entropy
  tables, parsed via `decode_dict`) and raw content dictionaries (bare
  byte sequences, parsed via `from_raw_content` with xxh3-derived ID),
  matching the transparent handling in the C FFI backend
- Remove the config-time guard that rejected `ZstdDict` compression
  policies under the pure backend — the feature is now fully supported
- Update `decompress_with_dict` to apply the same format-detection
  logic so raw content dicts round-trip correctly end-to-end

Unit tests (src/compression/zstd_pure.rs):
  - compress_with_dict_roundtrip_pure_to_pure
  - compress_with_dict_produces_zstd_magic
  - compress_with_dict_roundtrip_all_levels (levels 1, 3, 9, 19)
  - compress_with_dict_empty_dict_returns_error
  - compress_with_dict_raw_content_dict_works
  - compress_with_dict_empty_plaintext_roundtrips

Integration tests (tests/zstd_dict_roundtrip.rs, zstd_pure_dict mod):
  - pure_tree_write_flush_read_zstd_dict (full Tree write→flush→read)
  - pure_tree_range_scan_with_zstd_dict
  - pure_tree_reopen_with_dict_reads_back_correctly
  - pure_zstd_dict_missing_returns_error

Closes #218
@coderabbitai

coderabbitai Bot commented Apr 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 08b341ed-927a-4b05-9a14-033d60327186

📥 Commits

Reviewing files that changed from the base of the PR and between 5f2e3a0 and eb34b7f.

📒 Files selected for processing (1)
  • src/compression/zstd_pure.rs

📝 Walkthrough

Replaced the C-FFI zstd backend with a single pure‑Rust structured‑zstd backend; implemented dictionary compression/decompression (finalized vs raw-content dictionaries) with per-thread TLS caching of compressors and decoders; simplified the ZstdDictionary API; removed zstd_ffi; updated Cargo/features, the build script, CI, README, benchmarks, and tests.

Changes

  • Feature & dependency wiring (Cargo.toml, build.rs): zstd now maps to an optional structured-zstd dependency; zstd-pure is aliased to zstd; the separate optional zstd dependency entry is removed; build.rs emits zstd_any only when CARGO_FEATURE_ZSTD is set (see the build.rs sketch after this list).
  • Pure Rust backend, dictionary support (src/compression/zstd_pure.rs): Added DICT_MAGIC detection, strip_dict_id, bounded raw-content decoding, do_decompress_with_dict, a full compress_with_dict implementation (FrameCompressor), a TLS single-entry compressor cache keyed by (xxh3_64(dict_raw), level), raw vs finalized-dict paths, and extensive unit tests.
  • Compression module & types (src/compression/mod.rs): Removed C-FFI backend selection; ZstdBackend now resolves to the pure‑Rust provider under zstd; removed the prepared decoder field and accessor from ZstdDictionary; adjusted docs and id64 cfg visibility.
  • Removed C‑FFI backend (src/compression/zstd_ffi.rs): Deleted the C-FFI zstd backend and its ZstdFfiProvider implementation.
  • Configuration validation (src/config/mod.rs): Config::validate_zstd_dictionary() no longer rejects ZstdDict under the pure-Rust backend; it now computes dict_id and validates presence and match for referenced dictionaries.
  • CI / Coverage / Benchmarks (.github/workflows/coordinode-ci.yml, .github/workflows/benchmark.yml, benches/zstd_dict.rs): Added a test-zstd job and a zstd-specific coverage run; updated the benchmark checkout to actions/checkout@v6; renamed the benchmark label to tls_hit and adjusted comments.
  • Docs & README (README.md): Updated feature-flag docs: zstd is described as the pure‑Rust structured-zstd backend (no C deps), a performance note was added, and zstd-pure is deprecated as an alias.
  • Tests / Integration (tests/zstd_dict_roundtrip.rs): Added the integration test zstd_dict_survives_major_compaction, verifying that Zstd-dict compression survives major compaction and validating reads and range scans.
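For the feature-wiring row above, a hedged sketch of the build-script behavior it describes (the real build.rs may do more; Cargo exposes each enabled feature as a CARGO_FEATURE_* environment variable):

```rust
// build.rs sketch: emit the `zstd_any` cfg only when the `zstd` feature is enabled.
fn main() {
    if std::env::var_os("CARGO_FEATURE_ZSTD").is_some() {
        println!("cargo:rustc-cfg=zstd_any");
    }
}
```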

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant ZstdPure as ZstdPureProvider
    participant TLSCache
    participant Compressor as FrameCompressor
    participant Decoder as FrameDecoder

    rect rgba(200,200,255,0.5)
    Caller->>ZstdPure: compress_with_dict(input, dict_raw, level)
    ZstdPure->>TLSCache: lookup (xxh3_64(dict_raw), level)
    TLSCache-->>ZstdPure: compressor (create if missing)
    ZstdPure->>Compressor: compress(input, dict)
    Compressor-->>ZstdPure: compressed_frame
    ZstdPure->>ZstdPure: if raw-content dict → strip_dict_id(frame)
    ZstdPure-->>Caller: compressed_frame
    end

    rect rgba(200,255,200,0.5)
    Caller->>ZstdPure: decompress_with_dict(frame, dict_raw, capacity)
    ZstdPure->>ZstdPure: detect dict format (DICT_MAGIC / raw)
    alt raw-content path
        ZstdPure->>Decoder: init with force_dict + bounded decode
        Decoder-->>ZstdPure: decompressed_bytes / size_error
    else finalized-dict path
        ZstdPure->>Decoder: decode_all_to_vec(finalized)
        Decoder-->>ZstdPure: decompressed_bytes
    end
    ZstdPure-->>Caller: decompressed_bytes / error
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

  • #218: feat(compression): enable dictionary compression in pure Rust backend — This PR implements pure‑Rust dictionary compression/decompression requested by that issue.
  • #25: feat: dictionary builder — FastCOVER algorithm and dictionary finalization — The PR consumes finalized-dictionary format expectations (finalized dicts used by the compressor).
  • #232: (prepared-decoder caching proposal) — The PR replaces the prepared-decoder field with TLS caching keyed by a 64-bit fingerprint, directly addressing the related caching/design surface.

Poem

🐇
I hopped from FFI hollows to structured‑zstd bright,
Pure Rust frames now hum through day and night,
Dicts compress and decompress with care,
TLS caches warm, IDs stripped in the air,
Tests clap their paws — round trips pass just right.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'feat(compression): enable dictionary compression in pure Rust backend' accurately describes the main change of implementing compress_with_dict() in ZstdPureProvider using structured-zstd.
  • Linked Issues Check: ✅ Passed. The PR fully addresses all coding requirements from the linked issues: #25 (FastCOVER/dictionary finalization in upstream structured-zstd, now resolved) and #218 (implement compress_with_dict in the pure backend with round-trip compatibility and integration tests).
  • Out of Scope Changes Check: ✅ Passed. All changes are directly related to enabling dictionary compression in the pure Rust backend: the structured-zstd bump, C-FFI backend removal, ZstdPureProvider implementation updates, test additions, and CI/build configuration adjustments are all in scope.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.



@codecov

codecov Bot commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 97.39583% with 5 lines in your changes missing coverage. Please review.

Files with missing lines:
  • src/compression/zstd_pure.rs: patch coverage 97.39%, 5 lines missing ⚠️



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/compression/mod.rs`:
- Around line 38-43: The documentation for the compression function is too
strict: update the doc comment that currently says "must be a finalized zstd
dictionary" to state that dict_raw accepts both finalized zstd dictionaries
(magic 0x37A430EC header, entropy tables, content) and bare content dictionaries
as produced/returned by ZstdDictionary::new() / ZstdDictionary::raw(); mention
that both the C FFI backend and the pure Rust backend accept both formats so
callers know either form is valid.

In `@src/compression/zstd_pure.rs`:
- Around line 111-118: The frame encoder currently clamps the xxh3 low-32-bit
hash to 1 locally (let id = { xxh3_64(dict_raw) as u32; h.max(1) }) causing
mismatch with ZstdDictionary::id() and CompressionType::ZstdDict; instead
compute a single normalized dict id from the raw hash in the shared
dictionary-ID path (e.g., produce raw_id = (xxh3_64(dict_raw) as u32) and
normalized_id = raw_id.max(1)) and use normalized_id everywhere metadata,
ZstdDictionary::id(), CompressionType::ZstdDict, and frame encoding read/write
IDs; remove the local h.max(1) clamping in the encoder and apply the same change
for the other occurrence around lines 179-183 so both metadata and frame headers
observe the identical normalized id.

In `@tests/zstd_dict_roundtrip.rs`:
- Around line 270-271: The zstd_pure_dict module is gated with #[cfg(all(feature
= "zstd-pure", not(feature = "zstd")))] and will be skipped by cargo test
--all-features, so add a dedicated CI job that runs tests with the pure backend
enabled (e.g., run cargo test --no-default-features --features zstd-pure or
cargo test --features zstd-pure while ensuring the "zstd" feature is not
enabled) so the zstd_pure_dict module is exercised; update your CI workflow to
include this new job (name it clearly, e.g., "test: zstd-pure") and ensure it
uses the same test matrix/platforms as the other test jobs.
- Around line 278-291: The current make_test_dictionary() only wraps raw sample
bytes with ZstdDictionary::new(), so tests never exercise the
finalized-dictionary path; change or add a test fixture that actually
trains/finalizes a dictionary (use the library's training/finalize API instead
of ZstdDictionary::new()), then use that finalized dictionary in an end-to-end
encode/decode test to exercise Dictionary::decode_dict and
FrameCompressor::set_dictionary_from_bytes; locate make_test_dictionary and
replace or add a make_finalized_test_dictionary that invokes the training
function (or calls the dictionary finalization method) on the same samples and
return the finalized ZstdDictionary for the new test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9f444fc6-12f8-4148-ae09-8969ce781690

📥 Commits

Reviewing files that changed from the base of the PR and between ab61d33 and 746a513.

📒 Files selected for processing (5)
  • README.md
  • src/compression/mod.rs
  • src/compression/zstd_pure.rs
  • src/config/mod.rs
  • tests/zstd_dict_roundtrip.rs
💤 Files with no reviewable changes (2)
  • README.md
  • src/config/mod.rs

Comment thread src/compression/mod.rs Outdated
Comment thread src/compression/zstd_pure.rs Outdated
Comment thread tests/zstd_dict_roundtrip.rs Outdated
Comment thread tests/zstd_dict_roundtrip.rs Outdated

Copilot AI left a comment


Pull request overview

This PR enables zstd dictionary compression for the zstd-pure backend (structured-zstd), removing the previous config-time rejection and adding unit + integration tests to exercise dictionary-compressed Tree roundtrips under the pure backend.

Changes:

  • Implement compress_with_dict() for ZstdPureProvider, including finalized-vs-raw dictionary format detection.
  • Remove the Config::open() guard that previously rejected CompressionType::ZstdDict under zstd-pure.
  • Add new unit tests (src/compression/zstd_pure.rs) and integration tests (tests/zstd_dict_roundtrip.rs) for pure-backend dictionary compression.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/compression/zstd_pure.rs Implements pure-backend dictionary compression and expands unit test coverage.
src/config/mod.rs Removes the pure-backend dictionary-compression rejection during config validation.
src/compression/mod.rs Updates trait-level docs to reflect dictionary support in the pure backend.
tests/zstd_dict_roundtrip.rs Adds integration tests for dict compression via the pure backend through the Tree API.
README.md Updates documentation by removing the old “dict compression unsupported” limitation for zstd-pure.

Comment thread src/compression/mod.rs Outdated
Comment thread tests/zstd_dict_roundtrip.rs Outdated
…-zero

- Move .max(1) into ZstdDictionary::id() so config validation, stored
  metadata, and zstd frame headers all observe the same dict id
  (id=0 is invalid in the zstd frame format; edge case when xxh3
  lower 32 bits are zero)
- Remove redundant .max(1) from decompress_with_dict raw-content path
- Update compress_with_dict trait doc to reflect that both finalized
  and raw-content dictionaries are accepted by both backends
- Add dedicated test-zstd-pure CI job to exercise the pure backend
  independently (--all-features enables C zstd and skips the
  not(feature = "zstd") gate)
- Add pure_finalized_dict_roundtrip integration test to cover the
  Dictionary::decode_dict path in the pure Rust backend
- Correct make_test_dictionary() doc: builds a raw-content dict, not
  a finalized-dictionary fixture

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Comment thread src/compression/mod.rs Outdated
Comment thread src/compression/zstd_pure.rs Outdated
Comment thread .github/workflows/coordinode-ci.yml Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/coordinode-ci.yml:
- Around line 62-80: Add a new step in the existing test-zstd-pure job to run
clippy under the pure-only feature combo: insert a step before the "Run tests"
step that runs `cargo clippy --no-default-features --features zstd-pure,lz4
--all-targets -- -D warnings`; this ensures code behind #[cfg(all(feature =
"zstd-pure", not(feature = "zstd")))] (e.g. src/compression/zstd_pure.rs and
tests/zstd_dict_roundtrip.rs) is linted and won't introduce warnings.

In `@src/compression/mod.rs`:
- Around line 158-176: The current normalization in ZstdDictionary::id() forces
id=0 to 1 and breaks read compatibility: revert or limit that normalization so
existing stored metadata with dict_id == 0 can still be recognized;
specifically, keep ZstdDictionary::id64() unchanged, change ZstdDictionary::id()
to return the raw lower 32 bits (including 0) for read/validation paths and
perform the clamp-to-1 only in write/encoding code paths that emit new frame
headers (i.e., where CompressionType::ZstdDict is serialized), or alternatively
make validation logic that compares dictionaries accept both raw 0 and
normalized 1 as equivalent for the same dictionary; update the code paths that
currently call ZstdDictionary::id() for config validation and block/table
open-time checks to use the raw value or the tolerant comparison, and add a
regression test that writes metadata with dict_id == 0 and verifies it can be
read back without triggering ZstdDictMismatch.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: d7702fe5-b3ec-47af-9ebb-251245a313de

📥 Commits

Reviewing files that changed from the base of the PR and between 746a513 and aacc1f1.

📒 Files selected for processing (4)
  • .github/workflows/coordinode-ci.yml
  • src/compression/mod.rs
  • src/compression/zstd_pure.rs
  • tests/zstd_dict_roundtrip.rs

Comment thread .github/workflows/coordinode-ci.yml Outdated
Comment thread src/compression/mod.rs Outdated
- Add TLS FrameCompressor cache to compress_with_dict in pure Rust backend,
  matching the existing FrameDecoder TLS cache in decompress_with_dict
- Cache key: (xxh3_64(dict_raw), level); dictionary is parsed at most once
  per thread per distinct (dict, level) pair
- Use owned Cursor<Vec<u8>> source and Vec<u8> drain for 'static TLS compat
- Revert ZstdDictionary::id() to return raw lower-32-bits of xxh3 (no clamping);
  backends that embed dict_id in the zstd frame header clamp to .max(1) themselves
  to preserve backward compat with on-disk dict_id=0 (theoretical)
- Update id() doc to clarify that clamping is the backend's responsibility
- Restore dict.id().max(1) in decompress_with_dict raw-content path
- Add cargo clippy step for pure backend in test-zstd-pure CI job
- Update benchmark.yml actions/checkout from v4 to v6
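
A self-contained sketch of the single-entry, per-thread memoizer the commit above describes; `CachedCompressor` is a placeholder for the parsed dictionary plus configured `FrameCompressor`, and the key mirrors the `(xxh3_64(dict_raw), level)` pair (names and signatures are illustrative, not the crate's actual code):

```rust
use std::cell::RefCell;

/// Placeholder for the per-dictionary state the real code caches
/// (parsed Dictionary + configured FrameCompressor).
struct CachedCompressor {}

thread_local! {
    // Single-entry memoizer: evict-and-replace when the (dict hash, level) key changes.
    static TLS_COMPRESSOR: RefCell<Option<(u64, i32, CachedCompressor)>> = RefCell::new(None);
}

/// Run `use_it` with a compressor cached for `(dict_hash, level)` on this thread,
/// rebuilding it via `build` only when the key changed since the last call.
fn with_cached_compressor<R>(
    dict_hash: u64,
    level: i32,
    build: impl FnOnce() -> CachedCompressor,
    use_it: impl FnOnce(&mut CachedCompressor) -> R,
) -> R {
    TLS_COMPRESSOR.with(|cell| {
        let mut slot = cell.borrow_mut();
        let hit = matches!(&*slot, Some((h, l, _)) if *h == dict_hash && *l == level);
        if !hit {
            *slot = Some((dict_hash, level, build()));
        }
        let (_, _, compressor) = slot.as_mut().expect("slot was filled above");
        use_it(compressor)
    })
}
```

Evict-and-replace (rather than a keyed map) is sufficient because a compaction job uses one active dictionary per thread, as noted in the learnings later in this thread.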
Comment thread .github/workflows/coordinode-ci.yml Outdated

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/compression/mod.rs:178

  • PR description says ZstdDictionary::id() is normalized to always return ≥1, but the current implementation returns the raw lower 32 bits of xxh3 and the docs explicitly note it may be 0. Since id=0 is reserved/invalid in zstd frame headers, leaving id() able to return 0 means on-disk CompressionType::ZstdDict{dict_id} can be 0 while the pure backend clamps the frame header dict ID to ≥1 for raw-content dictionaries. Please either implement the promised normalization (and adjust docs/tests accordingly) or update the PR description to match the intended semantics.
    /// Returns a 32-bit dictionary fingerprint (lower 32 bits of xxh3).
    ///
    /// Intended for config validation (matching a `CompressionType::ZstdDict`
    /// `dict_id` against the supplied `ZstdDictionary`) and external interop.
    ///
    /// The value is the raw lower 32 bits of xxh3 and may theoretically be `0`
    /// (probability ≈ 1/2³²). Backends that embed a dict ID in the zstd frame
    /// header (where id=0 is reserved) are responsible for clamping to at
    /// least 1 themselves. Config validation is unaffected: both sides derive
    /// the ID from the same bytes and therefore agree even in the zero case.
    ///
    /// For internal cache keying use [`id64`](ZstdDictionary::id64) to avoid
    /// hash collisions.
    #[must_use]
    #[expect(
        clippy::cast_possible_truncation,
        reason = "intentional: public API returns 32-bit fingerprint"
    )]
    pub fn id(&self) -> u32 {
        self.id as u32
    }

Comment thread src/compression/zstd_pure.rs Outdated
Comment thread src/compression/zstd_pure.rs Outdated
- Fix module-level doc: remove reference to set_dictionary_from_bytes
  (implementation uses set_dictionary with a parsed Dictionary)
- Reuse the exhausted Cursor<Vec<u8>> source buffer across compress() calls
  via take_source().map_or_else(...) to avoid a per-block O(data.len())
  allocation after the first call on a given thread; drain ownership is
  transferred to the caller so its capacity cannot be recovered without
  an extra copy

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/compression/zstd_pure.rs`:
- Line 130: Extract the duplicated DICT_MAGIC array into a single module-level
constant and use it wherever the inline definition appears: add a top-level
const DICT_MAGIC: [u8; 4] = [0x37, 0xA4, 0x30, 0xEC] (with doc comment) near the
file imports, then remove the inline definitions in the functions (e.g., the
current local DICT_MAGIC in decompress_with_dict) and replace their uses with
the new module-level DICT_MAGIC constant so the magic value is defined once.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9739d26c-0a93-4be1-baaf-f423a58fe9c8

📥 Commits

Reviewing files that changed from the base of the PR and between 7650e6b and 1aec535.

📒 Files selected for processing (1)
  • src/compression/zstd_pure.rs

Comment thread src/compression/zstd_pure.rs Outdated
@polaz
Member Author

polaz commented Apr 7, 2026

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 7, 2026

✅ Actions performed

Full review triggered.


Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/compression/zstd_pure.rs`:
- Around line 146-159: The test suite is missing a cross-backend regression test
for raw-content dictionaries: add a test that creates a raw-content dictionary
(use Dictionary::from_raw_content with the synthetic id logic based on dict_key
and DICT_MAGIC/Dictionary::decode_dict semantics), compresses data using the
pure backend, then attempts to decompress the resulting frame using the FFI
backend to assert successful round-trip; ensure the test exercises the path
where dict_raw does not start with DICT_MAGIC so the synthetic id branch (the id
computed from dict_key) is used, and verify decompression succeeds despite the
differing dictID treatment between backends.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 027cfd5b-791d-4f90-a206-b3c0f3d2f603

📥 Commits

Reviewing files that changed from the base of the PR and between ab61d33 and 59bd591.

📒 Files selected for processing (7)
  • .github/workflows/benchmark.yml
  • .github/workflows/coordinode-ci.yml
  • README.md
  • src/compression/mod.rs
  • src/compression/zstd_pure.rs
  • src/config/mod.rs
  • tests/zstd_dict_roundtrip.rs
💤 Files with no reviewable changes (2)
  • README.md
  • src/config/mod.rs

Comment thread src/compression/zstd_pure.rs
The pure Rust backend (structured-zstd) rejects dictID=0 for
Dictionary::from_raw_content, so it internally assigns a synthetic
non-zero ID derived from xxh3_64(dict_raw). The C zstd library always
records dictID=0 (absent) for raw-content dicts, causing two failures:

  pure → FFI: pure embeds synthetic dictID in frame header;
  libzstd ZSTD_DDict has id=0; decompressor checks
  dctx->dictID(0) != fParams.dictID(synthetic) → "Dictionary mismatch".

  FFI → pure: FFI records dictID=0 in frame; structured-zstd
  treats dictID=0 as "no dict required" and skips dict lookup,
  then fails decompression with NotEnoughBytesInDictionary.

Fix the pure backend in both directions:

  compress_with_dict (raw-content): after compression, strip the
  Dict_ID field from the zstd frame header via strip_dict_id().
  The synthetic ID is an internal detail — removing it aligns the
  output with the C convention (dictID absent/0), so the C FFI
  decompressor skips the ID check and accepts the frame.

  decompress_with_dict (raw-content): replace decode_all_to_vec
  (which calls init() internally and skips dict loading when
  dictID=0) with the manual init() → force_dict() → decode_blocks()
  → collect() flow. force_dict() loads the raw-content dict
  regardless of the frame's dictID field, handling all three cases:
    - Frame from C FFI backend (dictID absent): force_dict loads dict.
    - Frame from new pure backend (dictID stripped): same.
    - Frame from old pure backend (dictID=synthetic): force_dict
      reloads same dict (idempotent).

Add cross-backend interoperability tests gated on
#[cfg(all(test, feature = "zstd", feature = "zstd-pure"))]:
  pure_compress_ffi_decompress_raw_content_dict_roundtrip
  ffi_compress_pure_decompress_raw_content_dict_roundtrip

mod zstd_pure is now compiled under #[cfg(feature = "zstd-pure")]
(previously not(feature = "zstd")) so it is visible to the
cross-backend test module when --all-features is used.

Closes #218

@sw-release-bot sw-release-bot Bot left a comment


⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'lsm-tree db_bench'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.

Benchmark suite: mergerandom
  • Current (ffd9512): 261870.7610778383 ops/sec (normalized)
  • Previous (ab61d33): 380260.6261186645 ops/sec (normalized)
  • Ratio: 1.45

This comment was automatically generated by workflow using github-action-benchmark.

CC: @polaz


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/compression/zstd_pure.rs`:
- Around line 151-174: The TLS variables (e.g., TLS_COMPRESSOR holding
Option<(u64, i32, CachedCompressor)>) implement a single-entry memoizer contrary
to the comment; replace the Option with a small keyed cache (HashMap or tiny
LRU) keyed by (dict_hash: u64, level: i32) so a thread can store multiple
CachedCompressor entries and avoid re-parsing when switching
dictionaries/levels; update the thread_local! RefCell<Option<...>> declarations
to RefCell<HashMap<(u64, i32), CachedCompressor>> or a bounded LRU map and apply
the same change to the analogous TLS cells referenced in the diff (the decoder
TLS instances), ensuring lookups insert/evict correctly and keys use
dict.id64()/level as described in the comments.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6b0df8e2-09ae-4eda-a5fd-d5d8f7ca0431

📥 Commits

Reviewing files that changed from the base of the PR and between 59bd591 and 9ce1517.

📒 Files selected for processing (2)
  • src/compression/mod.rs
  • src/compression/zstd_pure.rs

Comment thread src/compression/zstd_pure.rs
@polaz polaz requested a review from Copilot April 7, 2026 21:25

Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Comment thread tests/zstd_dict_roundtrip.rs
Comment thread src/compression/mod.rs Outdated
Comment thread src/compression/zstd_pure.rs Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
src/compression/zstd_pure.rs (1)

344-345: ⚠️ Potential issue | 🟡 Minor

Keep the finalized-dictionary magic wording consistent.

These comments still spell the header as 0x37A430EC, which reintroduces the endian confusion you already fixed for DICT_MAGIC. Reword them to reference the byte sequence (37 A4 30 EC) or the corrected little-endian value instead.

Also applies to: 499-503

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/compression/zstd_pure.rs` around lines 344 - 345, Update the
finalized-dictionary comment wording so it does not use the big-endian hex
literal 0x37A430EC (which causes endian confusion); instead reference the byte
sequence "37 A4 30 EC" or the corrected little-endian numeric value wherever the
header is described (e.g., the comments near DICT_MAGIC and the
finalized-dictionary descriptions in src/compression/zstd_pure.rs, including the
other occurrence around the block that spans lines ~499-503). Ensure both places
use the same phrasing ("byte sequence 37 A4 30 EC" or the corrected
little-endian value) so the comment is unambiguous across platforms and matches
the DICT_MAGIC constant.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/coordinode-ci.yml:
- Around line 122-124: The coverage comment on the workflow step that runs
`cargo +nightly llvm-cov --no-report nextest --no-default-features --features
zstd,lz4` is misleading because `--all-features` now includes the `zstd` backend
(since `zstd-pure` is an alias), so update the comment to state that this extra
run validates a narrower feature combination (zstd and lz4 without default
features) rather than exercising dictionary-compression paths missed by
`--all-features`; keep the note that the extra run is intentional for validating
that specific feature set and remove the incorrect claim about exercising
dictionary paths.

In `@src/compression/mod.rs`:
- Around line 28-34: Update the doc comment for the compression function to
avoid implying ZstdDictionary::raw() only yields a finalized dictionary:
rephrase the sentence under "Compress `data` using a zstd dictionary." so it
states that `dict_raw` may be either a finalized zstd dictionary (magic
0x37A430EC, entropy tables, content — produced by `zstd --train`) or raw content
bytes, and that callers may pass bytes produced by `zstd --train` or bytes
obtained from `ZstdDictionary::raw()`; make clear the zstd backend accepts
either representation. Target the doc block surrounding the function that begins
with "Compress `data` using a zstd dictionary." and adjust the wording
accordingly.

In `@src/compression/zstd_pure.rs`:
- Around line 179-183: Replace the misleading "// SAFETY:" comment above the
slice access with "// Invariant:" because this is documenting a safe
slice-bounds invariant (not reasoning for an unsafe block); update the comment
that precedes the call to output.get_mut(prev_len..) / unwrap_or_else(...) so it
reads "// Invariant: ..." and explains that output was resized to new_len and
prev_len.. is a valid slice (referencing variables output, prev_len, new_len,
and dest to locate the code).

---

Duplicate comments:
In `@src/compression/zstd_pure.rs`:
- Around line 344-345: Update the finalized-dictionary comment wording so it
does not use the big-endian hex literal 0x37A430EC (which causes endian
confusion); instead reference the byte sequence "37 A4 30 EC" or the corrected
little-endian numeric value wherever the header is described (e.g., the comments
near DICT_MAGIC and the finalized-dictionary descriptions in
src/compression/zstd_pure.rs, including the other occurrence around the block
that spans lines ~499-503). Ensure both places use the same phrasing ("byte
sequence 37 A4 30 EC" or the corrected little-endian value) so the comment is
unambiguous across platforms and matches the DICT_MAGIC constant.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6ec60988-727d-4765-8fed-1ff6a08976e0

📥 Commits

Reviewing files that changed from the base of the PR and between ab61d33 and ef55325.

📒 Files selected for processing (11)
  • .github/workflows/benchmark.yml
  • .github/workflows/coordinode-ci.yml
  • Cargo.toml
  • README.md
  • benches/zstd_dict.rs
  • build.rs
  • src/compression/mod.rs
  • src/compression/zstd_ffi.rs
  • src/compression/zstd_pure.rs
  • src/config/mod.rs
  • tests/zstd_dict_roundtrip.rs
💤 Files with no reviewable changes (2)
  • src/config/mod.rs
  • src/compression/zstd_ffi.rs

Comment thread .github/workflows/coordinode-ci.yml Outdated
Comment thread src/compression/mod.rs Outdated
Comment thread src/compression/zstd_pure.rs Outdated
…issues

- Standardize '0x37A430EC' → bytes '37 A4 30 EC' in all comments
  (mod.rs compress_with_dict, zstd_pure.rs compress/decompress paths)
- mod.rs compress_with_dict: clarify ZstdDictionary::raw() is an accessor,
  not a dict producer; mention it's accessible for persistence/interop
- zstd_pure.rs decode_raw_content_bounded: rename '// SAFETY:' → '// Invariant:'
  (not an unsafe block, documents safe slice-bounds invariant)
- zstd_dict_roundtrip.rs: fix assertion message 'L0 SSTs' → 'tables'
  (table_count() returns total across all levels, not L0-only)
- coordinode-ci.yml: correct coverage comment — zstd-pure is now an alias
  for zstd, so the extra run validates a narrower feature set, not dict paths
  missed by --all-features
@polaz polaz requested a review from Copilot April 8, 2026 22:46
@polaz
Member Author

polaz commented Apr 8, 2026

@coderabbitai re-review

@coderabbitai

coderabbitai Bot commented Apr 8, 2026

@polaz Sure! Triggering a fresh re-review to pick up any remaining issues after the latest round of fixes.


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 229
File: src/compression/zstd_pure.rs:165-176
Timestamp: 2026-04-08T16:02:08.879Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), the `decode_raw_content_bounded` loop's early-return guard was fixed by replacing `if remaining == 0` with `if remaining.max(1) == 0` (effectively: `remaining > 0` is no longer required before calling `decode_blocks`). The root cause: for empty-plaintext frames with raw-content dictionaries the FCS is 0, so the decompression-bomb pre-check in `do_decompress_with_dict` is skipped, and `decode_raw_content_bounded` is entered with `capacity=0`. On the first iteration `!decoder.is_finished()` is true (no blocks decoded yet) and the old `remaining == 0` guard fired prematurely, returning `DecompressedSizeTooLarge` for a legitimate empty frame. The fix allows the decoder to advance past the empty `Last_Block` before checking capacity. Regression tests `..._at_capacity_one` and `..._at_exact_capacity` cover this path. Do not re-flag the `remaining == 0` check as a bug in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-08T21:33:42.308Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), both TLS caches (`TLS_COMPRESSOR` and `TLS_DECODER`) are intentionally single-entry memoizers keyed by `(xxh3_64(dict_raw), level)`. They evict-and-replace on a dictionary/level change rather than growing a keyed map. This is sufficient because an LSM-tree compaction job uses one active dictionary per thread. A multi-entry keyed cache for workloads interleaving multiple dicts per thread is a valid follow-up optimization tracked in issue `#231`. Do not flag the single-entry TLS design as missing a multi-entry keyed cache in future reviews of this repository.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 229
File: src/compression/zstd_pure.rs:151-174
Timestamp: 2026-04-07T21:55:23.591Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), both TLS caches (`TLS_COMPRESSOR` and `TLS_DECODER`) are intentionally single-entry memoizers: each stores `Option<(key, ..., compressor/decoder)>` and evicts-and-replaces on a dictionary/level change rather than growing a keyed map. This is sufficient because an LSM-tree typically uses one active dictionary. Do not flag the single-entry TLS design as missing a multi-entry keyed cache in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 229
File: src/compression/zstd_pure.rs:151-174
Timestamp: 2026-04-07T21:55:23.591Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), the raw-content dict decompression path uses a `content_size()` decompression-bomb guard (added in the same commit as the TLS comment fix) to limit output size before allocating, consistent with the finalized-dict path. Do not flag the absence of this guard in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 229
File: src/compression/zstd_pure.rs:146-159
Timestamp: 2026-04-07T20:57:55.760Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), cross-backend raw-content dictionary interop was fixed in commit 9ce15176:
- Pure → FFI: after `FrameCompressor::compress()` for a raw-content dictionary, `strip_dict_id()` is called on the output frame to remove the synthetic dictID (which structured-zstd requires internally but must not appear on the wire for C-backend compatibility).
- FFI → pure: `decompress_with_dict` for raw-content dicts now uses the manual `init() → force_dict() → decode_blocks() → collect()` flow instead of `decode_all_to_vec()`. `force_dict` loads the dictionary unconditionally regardless of the frame's dictID field, handling FFI-produced frames (dictID=0), new pure frames (dictID stripped), and old pure frames (dictID=synthetic). Do not flag the absence of `decode_all_to_vec` for raw-content decompression in future reviews.
- Regression tests live in `cross_backend_interop_tests` under `#[cfg(all(test, feature = "zstd", feature = "zstd-pure"))]`.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 202
File: src/table/tests.rs:2113-2118
Timestamp: 2026-04-02T23:53:51.122Z
Learning: In `src/table/block_index/iter.rs` and `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), `trim_back_to_upper_bound` contains an unconditional restore (the `else { true }` branch at lines ~787-788 of decoder.rs) that always re-pushes a covering entry onto `hi_scanner.stack` when the stack would otherwise empty. This means `seek_upper_bound_cursor` always returns `true` for well-formed block data, regardless of `restart_interval`. Consequently, the `OwnedIndexBlockIter::from_block_with_bounds` returning `Ok(None)` (empty-child path) cannot be forced from test configuration (e.g., `use_index_block_restart_interval(4)`) on non-corrupt data — it is only reachable via `fill_stack`/`advance_upper_restart_interval` decode failure. Regression tests for the two-level index empty-child skip behavior (PR `#202`, issue `#194`) validate overall iteration correctness defensively, not the exact `Ok(None)` corruption path. Do not suggest forcing this path via `restart_interval` changes in reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-30T22:48:06.079Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), the "binary search trusts semantically-corrupted binary index" concern (lines 245-299) is explicitly out of scope for PR `#182`. The scenario requires a corrupted binary index that passes the block checksum (encoder bug or checksum collision). Adding O(n) monotonicity validation per seek call would regress the hot path for a case already caught by checksums. This hardening is tracked in issue `#184` (fallible decoder constructor). Do not flag the absence of monotonicity validation on the binary-index seek path in reviews of PR `#182` or earlier code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-08T18:47:20.584Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), `do_decompress_with_dict` accepts a pre-computed `raw_content_id: u32` parameter (callers pass `dict.id().max(1)` from the cached `ZstdDictionary` handle) to avoid re-hashing `dict_raw` on every decompression call. Do not flag the absence of an inline `xxh3_64(dict_raw)` call inside `do_decompress_with_dict` as a missing ID derivation in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-27T07:10:44.958Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), the restart-key span validation is intentionally split: strict bounds hardening (buffer/offset checks) was implemented in PR `#182`, while full semantic restart-key span validation (which requires passing the restart-key span through decoder state) is tracked separately in issue `#183`. Do not flag the absence of semantic-span validation in decoder.rs as an unaddressed gap in reviews of PR `#182` or earlier code; it is a known, tracked follow-up.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-31T14:37:26.098Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), the `unwrap!` calls on `read_u8`/`read_u32` inside `Decoder::try_new()` (lines ~139–151) will panic on a truncated trailer slice. This is guarded at a higher level: `Block::from_reader` and checksum validation reject undersized blocks before they reach the decoder, and `Trailer::trailer_offset()` would underflow on undersized blocks — making this a pre-existing pipeline invariant, not introduced by PR `#191`. Making `Trailer::as_slice()` and its downstream reads fallible to handle truncated/undersized blocks gracefully is tracked in issue `#192`. Do not re-flag the `unwrap!` reads in `Decoder::try_new()` as an unaddressed gap in reviews of PR `#191` or earlier code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T20:34:35.867Z
Learning: In `src/vlog/mod.rs` (structured-world/coordinode-lsm-tree), `recover_blob_files` defers all `insert_for_blob_file` calls into a `pending_cache_inserts` buffer and commits them only after the missing-file guard (`blob_files.len() < ids.len()`) passes. This was fixed in PR `#208` to prevent leaking FDs into the descriptor-table cache on partial/failed recovery. Do not flag the absence of this deferral as a gap in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 182
File: src/table/block/decoder.rs:236-243
Timestamp: 2026-03-30T17:47:54.258Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), `get_key_at` calls `Item::parse_full` on every binary-search probe inside `partition_point` / `partition_point_2` as a result of the bounds-hardening in PR `#182` (which replaced the previous unsafe `get_unchecked` + `parse_restart_key` path). Threading `entries_end` into `parse_restart_key()` to restore a lighter bounded fast path is tracked in issue `#190`. Do not flag the full-item decode in `get_key_at` as an unaddressed gap in reviews of PR `#182` or later code; it is a known, tracked follow-up.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-30T04:38:24.388Z
Learning: In `src/table/index_block/block_handle.rs` (structured-world/coordinode-lsm-tree), `parse_full` bounds hardening, `#[must_use]` annotations on parser methods, and remaining `value_type` unwrap paths in `parse_full` are pre-existing code outside the diff of PR `#182`. The broader hardening (including a fallible decoder constructor) is tracked in issue `#184`. Do not flag these as unaddressed gaps in reviews of PR `#182`; they are known, tracked follow-ups.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T10:23:54.480Z
Learning: In `src/manifest.rs` (structured-world/coordinode-lsm-tree), both manifest test helpers (`write_test_manifest` and the related decode helper) were updated in PR `#212` (commit 61016156) with doc comments explaining why all four sfa sections (format_version, tree_type, level_count, filter_hash_type) must be written — the sfa archive format requires all sections to be present for a valid manifest. Do not flag the absence of this explanation in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.461Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), all 5 GC unit tests were updated in PR `#212` to seed distinct `v{id}` files (IDs 1–4) into MemFs before calling `maintenance`, and to assert that the correct files are deleted or preserved after the call. Do not flag the absence of seeded v* files in those tests in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-31T16:07:14.590Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), after commit 40361df in PR `#191`, `Decoder::try_new` returns `Err(crate::Error::InvalidTrailer)` when `compute_entries_end()` returns `None` (invalid binary/hash-index layout), instead of silently accepting the block with `cached_entries_end = None`. Do not flag the absence of this check in reviews of PR `#191` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-06T10:54:29.157Z
Learning: In `Cargo.toml` (structured-world/coordinode-lsm-tree, PR `#214`), the `structured-zstd` dependency was updated to version `0.0.7` (latest as of the update). Version `0.0.6` (used earlier in the PR) was always a valid published version. Do not flag either version as invalid in reviews of PR `#214`.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.

@polaz
Member Author

polaz commented Apr 8, 2026

@coderabbitai re-review

@coderabbitai

coderabbitai Bot commented Apr 8, 2026

@polaz Sure! Triggering a fresh re-review to pick up any remaining issues after the latest round of fixes.


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 229
File: src/compression/zstd_pure.rs:165-176
Timestamp: 2026-04-08T16:02:08.879Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), the `decode_raw_content_bounded` loop's early-return guard was fixed by replacing `if remaining == 0` with `if remaining.max(1) == 0` (effectively: `remaining > 0` is no longer required before calling `decode_blocks`). The root cause: for empty-plaintext frames with raw-content dictionaries the FCS is 0, so the decompression-bomb pre-check in `do_decompress_with_dict` is skipped, and `decode_raw_content_bounded` is entered with `capacity=0`. On the first iteration `!decoder.is_finished()` is true (no blocks decoded yet) and the old `remaining == 0` guard fired prematurely, returning `DecompressedSizeTooLarge` for a legitimate empty frame. The fix allows the decoder to advance past the empty `Last_Block` before checking capacity. Regression tests `..._at_capacity_one` and `..._at_exact_capacity` cover this path. Do not re-flag the `remaining == 0` check as a bug in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-08T21:33:42.308Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), both TLS caches (`TLS_COMPRESSOR` and `TLS_DECODER`) are intentionally single-entry memoizers keyed by `(xxh3_64(dict_raw), level)`. They evict-and-replace on a dictionary/level change rather than growing a keyed map. This is sufficient because an LSM-tree compaction job uses one active dictionary per thread. A multi-entry keyed cache for workloads interleaving multiple dicts per thread is a valid follow-up optimization tracked in issue `#231`. Do not flag the single-entry TLS design as missing a multi-entry keyed cache in future reviews of this repository.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 229
File: src/compression/zstd_pure.rs:151-174
Timestamp: 2026-04-07T21:55:23.591Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), both TLS caches (`TLS_COMPRESSOR` and `TLS_DECODER`) are intentionally single-entry memoizers: each stores `Option<(key, ..., compressor/decoder)>` and evicts-and-replaces on a dictionary/level change rather than growing a keyed map. This is sufficient because an LSM-tree typically uses one active dictionary. Do not flag the single-entry TLS design as missing a multi-entry keyed cache in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 229
File: src/compression/zstd_pure.rs:151-174
Timestamp: 2026-04-07T21:55:23.591Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), the raw-content dict decompression path uses a `content_size()` decompression-bomb guard (added in the same commit as the TLS comment fix) to limit output size before allocating, consistent with the finalized-dict path. Do not flag the absence of this guard in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 229
File: src/compression/zstd_pure.rs:146-159
Timestamp: 2026-04-07T20:57:55.760Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), cross-backend raw-content dictionary interop was fixed in commit 9ce15176:
- Pure → FFI: after `FrameCompressor::compress()` for a raw-content dictionary, `strip_dict_id()` is called on the output frame to remove the synthetic dictID (which structured-zstd requires internally but must not appear on the wire for C-backend compatibility).
- FFI → pure: `decompress_with_dict` for raw-content dicts now uses the manual `init() → force_dict() → decode_blocks() → collect()` flow instead of `decode_all_to_vec()`. `force_dict` loads the dictionary unconditionally regardless of the frame's dictID field, handling FFI-produced frames (dictID=0), new pure frames (dictID stripped), and old pure frames (dictID=synthetic). Do not flag the absence of `decode_all_to_vec` for raw-content decompression in future reviews.
- Regression tests live in `cross_backend_interop_tests` under `#[cfg(all(test, feature = "zstd", feature = "zstd-pure"))]`.
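
A simplified sketch of the Pure → FFI header surgery described in the first bullet above, assuming a well-formed single zstd frame at the start of the buffer (header layout per RFC 8878); the PR's actual `strip_dict_id` may differ in validation and error handling:

```rust
/// In-place removal of the Dictionary_ID field from a zstd frame header.
fn strip_dict_id(buf: &mut Vec<u8>) {
    // Dictionary_ID_flag (bits 0-1 of the descriptor) maps to 0/1/2/4 bytes.
    const DID_SIZES: [usize; 4] = [0, 1, 2, 4];
    if buf.len() < 6 || buf[..4] != [0x28u8, 0xB5, 0x2F, 0xFD] {
        return; // not a zstd frame we recognize
    }
    let fhd = buf[4];
    let did_size = DID_SIZES[(fhd & 0b11) as usize];
    if did_size == 0 {
        return; // no dictID embedded, nothing to strip
    }
    let single_segment = (fhd & 0b0010_0000) != 0;
    // magic (4) + frame header descriptor (1) + optional Window_Descriptor (1)
    let did_offset = 5 + usize::from(!single_segment);
    if buf.len() < did_offset + did_size {
        return; // truncated header, leave untouched
    }
    // Clear the Dictionary_ID_flag bits so the header matches the shorter layout,
    // then slide the rest of the frame left over the dictID bytes and shrink.
    buf[4] = fhd & 0b1111_1100;
    buf.copy_within(did_offset + did_size.., did_offset);
    let new_len = buf.len() - did_size;
    buf.truncate(new_len);
}
```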

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 202
File: src/table/tests.rs:2113-2118
Timestamp: 2026-04-02T23:53:51.122Z
Learning: In `src/table/block_index/iter.rs` and `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), `trim_back_to_upper_bound` contains an unconditional restore (the `else { true }` branch at lines ~787-788 of decoder.rs) that always re-pushes a covering entry onto `hi_scanner.stack` when the stack would otherwise empty. This means `seek_upper_bound_cursor` always returns `true` for well-formed block data, regardless of `restart_interval`. Consequently, the `OwnedIndexBlockIter::from_block_with_bounds` returning `Ok(None)` (empty-child path) cannot be forced from test configuration (e.g., `use_index_block_restart_interval(4)`) on non-corrupt data — it is only reachable via `fill_stack`/`advance_upper_restart_interval` decode failure. Regression tests for the two-level index empty-child skip behavior (PR `#202`, issue `#194`) validate overall iteration correctness defensively, not the exact `Ok(None)` corruption path. Do not suggest forcing this path via `restart_interval` changes in reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-30T22:48:06.079Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), the "binary search trusts semantically-corrupted binary index" concern (lines 245-299) is explicitly out of scope for PR `#182`. The scenario requires a corrupted binary index that passes the block checksum (encoder bug or checksum collision). Adding O(n) monotonicity validation per seek call would regress the hot path for a case already caught by checksums. This hardening is tracked in issue `#184` (fallible decoder constructor). Do not flag the absence of monotonicity validation on the binary-index seek path in reviews of PR `#182` or earlier code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-08T18:47:20.584Z
Learning: In `src/compression/zstd_pure.rs` (structured-world/coordinode-lsm-tree, PR `#229`), `do_decompress_with_dict` accepts a pre-computed `raw_content_id: u32` parameter (callers pass `dict.id().max(1)` from the cached `ZstdDictionary` handle) to avoid re-hashing `dict_raw` on every decompression call. Do not flag the absence of an inline `xxh3_64(dict_raw)` call inside `do_decompress_with_dict` as a missing ID derivation in future reviews.
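
A small sketch of the call shape this describes; the struct and function bodies are placeholders wrapped around the names the learning mentions (`ZstdDictionary`, `id()`, `do_decompress_with_dict`):

```rust
/// Placeholder handle; the real ZstdDictionary also caches the pre-compiled dictionary.
struct ZstdDictionary {
    raw: Vec<u8>,
    id64: u64, // full 64-bit xxh3 of `raw`, computed once at construction
}

impl ZstdDictionary {
    fn id(&self) -> u32 {
        self.id64 as u32 // lower 32 bits; may theoretically be 0
    }
}

fn decompress_with_dict(dict: &ZstdDictionary, frame: &[u8], capacity: usize) -> Vec<u8> {
    // The synthetic raw-content id is derived from the cached handle; the inner
    // function never re-hashes dict.raw on a per-call basis.
    let raw_content_id = dict.id().max(1); // id = 0 is reserved in the zstd frame format
    do_decompress_with_dict(frame, &dict.raw, raw_content_id, capacity)
}

fn do_decompress_with_dict(
    _frame: &[u8],
    _dict_raw: &[u8],
    _raw_content_id: u32,
    _capacity: usize,
) -> Vec<u8> {
    Vec::new() // placeholder: the real body decodes using the pre-computed id
}
```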

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-27T07:10:44.958Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), the restart-key span validation is intentionally split: strict bounds hardening (buffer/offset checks) was implemented in PR `#182`, while full semantic restart-key span validation (which requires passing the restart-key span through decoder state) is tracked separately in issue `#183`. Do not flag the absence of semantic-span validation in decoder.rs as an unaddressed gap in reviews of PR `#182` or earlier code; it is a known, tracked follow-up.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-31T14:37:26.098Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), the `unwrap!` calls on `read_u8`/`read_u32` inside `Decoder::try_new()` (lines ~139–151) will panic on a truncated trailer slice. This is guarded at a higher level: `Block::from_reader` and checksum validation reject undersized blocks before they reach the decoder, and `Trailer::trailer_offset()` would underflow on undersized blocks — making this a pre-existing pipeline invariant, not introduced by PR `#191`. Making `Trailer::as_slice()` and its downstream reads fallible to handle truncated/undersized blocks gracefully is tracked in issue `#192`. Do not re-flag the `unwrap!` reads in `Decoder::try_new()` as an unaddressed gap in reviews of PR `#191` or earlier code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T20:34:35.867Z
Learning: In `src/vlog/mod.rs` (structured-world/coordinode-lsm-tree), `recover_blob_files` defers all `insert_for_blob_file` calls into a `pending_cache_inserts` buffer and commits them only after the missing-file guard (`blob_files.len() < ids.len()`) passes. This was fixed in PR `#208` to prevent leaking FDs into the descriptor-table cache on partial/failed recovery. Do not flag the absence of this deferral as a gap in future reviews of PR `#208` or later code.
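
A hedged sketch of the deferral pattern only; the types are placeholders and the real `recover_blob_files` does considerably more work per file:

```rust
use std::path::PathBuf;

/// Placeholder for the descriptor-table cache used in src/vlog/mod.rs.
struct DescriptorCache;
impl DescriptorCache {
    fn insert_for_blob_file(&mut self, _id: u64, _path: PathBuf) {}
}

#[derive(Debug)]
struct RecoveryError;

/// Nothing touches the shared cache until the missing-file guard has passed,
/// so a failed or partial recovery leaks no file descriptors into it.
fn recover_blob_files(
    cache: &mut DescriptorCache,
    expected_ids: &[u64],
    blob_files: &[(u64, PathBuf)],
) -> Result<(), RecoveryError> {
    let mut pending_cache_inserts: Vec<(u64, PathBuf)> = Vec::new();
    for (id, path) in blob_files {
        // In the real code this step opens/validates the file before queueing it.
        pending_cache_inserts.push((*id, path.clone()));
    }
    if blob_files.len() < expected_ids.len() {
        // Some blob files are missing: abort before any cache insert happens.
        return Err(RecoveryError);
    }
    for (id, path) in pending_cache_inserts {
        cache.insert_for_blob_file(id, path);
    }
    Ok(())
}
```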

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 182
File: src/table/block/decoder.rs:236-243
Timestamp: 2026-03-30T17:47:54.258Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), `get_key_at` calls `Item::parse_full` on every binary-search probe inside `partition_point` / `partition_point_2` as a result of the bounds-hardening in PR `#182` (which replaced the previous unsafe `get_unchecked` + `parse_restart_key` path). Threading `entries_end` into `parse_restart_key()` to restore a lighter bounded fast path is tracked in issue `#190`. Do not flag the full-item decode in `get_key_at` as an unaddressed gap in reviews of PR `#182` or later code; it is a known, tracked follow-up.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-30T04:38:24.388Z
Learning: In `src/table/index_block/block_handle.rs` (structured-world/coordinode-lsm-tree), `parse_full` bounds hardening, `#[must_use]` annotations on parser methods, and remaining `value_type` unwrap paths in `parse_full` are pre-existing code outside the diff of PR `#182`. The broader hardening (including a fallible decoder constructor) is tracked in issue `#184`. Do not flag these as unaddressed gaps in reviews of PR `#182`; they are known, tracked follow-ups.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-28T13:41:55.184Z
Learning: In `src/table/index_block/block_handle.rs` (structured-world/coordinode-lsm-tree), the `parse_truncated` function now validates `shared_prefix_len` against both the entry-start boundary AND the restart-key span (`base_key_end`). This was fixed in PR `#182` commit `f64f90e` (originally tracked as follow-up issue `#186`). The function signature was updated to accept a `base_key_end: usize` parameter, and the prefix-end check was changed from `prefix_end > offset` to `prefix_end > base_key_end`. Do not flag the absence of this restart-key span check in reviews of PR `#182` or later code — it is fully implemented.
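
Only the shape of the tightened check, not the real parser's offset math (which differs):

```rust
/// A truncated entry's shared prefix must stay inside the restart key it borrows
/// from (`base_key_end`), not merely before the entry's own start offset.
/// The #182 fix bounds the prefix by the restart-key span in addition to the
/// entry-start boundary.
fn shared_prefix_in_bounds(prefix_end: usize, entry_start: usize, base_key_end: usize) -> bool {
    prefix_end <= entry_start && prefix_end <= base_key_end
}
```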

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T10:23:54.480Z
Learning: In `src/manifest.rs` (structured-world/coordinode-lsm-tree), both manifest test helpers (`write_test_manifest` and the related decode helper) were updated in PR `#212` (commit 61016156) with doc comments explaining why all four sfa sections (format_version, tree_type, level_count, filter_hash_type) must be written — the sfa archive format requires all sections to be present for a valid manifest. Do not flag the absence of this explanation in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.461Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), all 5 GC unit tests were updated in PR `#212` to seed distinct `v{id}` files (IDs 1–4) into MemFs before calling `maintenance`, and to assert that the correct files are deleted or preserved after the call. Do not flag the absence of seeded v* files in those tests in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-06T10:54:29.157Z
Learning: In `Cargo.toml` (structured-world/coordinode-lsm-tree, PR `#214`), the `structured-zstd` dependency was updated to version `0.0.7` (latest as of the update). Version `0.0.6` (used earlier in the PR) was always a valid published version. Do not flag either version as invalid in reviews of PR `#214`.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T01:09:33.237Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), `SuperVersions::maintenance` GC loop was fixed in PR `#212` (commit ecfa47c8) to eliminate the TOCTOU between `fs.exists()` and `fs.remove_file()`. The canonical fix calls `fs.remove_file()` directly and treats `ErrorKind::NotFound` as success. Do not flag the absence of a preceding `exists()` guard in future reviews of PR `#212` or later code.
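
The idempotent-delete shape, sketched against std::fs for brevity (the real code goes through the crate's filesystem abstraction, not std::fs directly):

```rust
use std::io::{self, ErrorKind};
use std::path::Path;

/// Delete without a prior exists() probe, and treat NotFound as success so a
/// concurrent deletion cannot fail GC; this removes the TOCTOU window entirely.
fn remove_file_idempotent(path: &Path) -> io::Result<()> {
    match std::fs::remove_file(path) {
        Ok(()) => Ok(()),
        Err(e) if e.kind() == ErrorKind::NotFound => Ok(()),
        Err(e) => Err(e),
    }
}
```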

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T10:11:17.013Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), the `table_count` and `blob_file_count` corruption unit tests were updated in PR `#212` (commit 49a23065) to assert `matches!(err, crate::Error::Unrecoverable)` instead of generic `is_err()`. This pins the exact error variant returned by the bounds-check path. Do not flag the absence of this precise assertion in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T01:09:33.237Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), all maintenance-function unit tests were updated in PR `#212` (commit ecfa47c8) to use `MemFs` instead of `Path::new(".")` + `StdFs`, making them fully isolated from the real filesystem. Do not flag real-filesystem usage in those tests in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T09:13:22.289Z
Learning: In `tests/tree_non_utf8.rs` (structured-world/coordinode-lsm-tree), the `tree_reopen_rejects_non_utf8_filename_in_data_dir` test was updated in PR `#212` (commit 77785031) to return `lsm_tree::Result<()>` and propagate errors with `?` instead of calling `unwrap()` on I/O paths. This satisfies the repo-wide no-`unwrap()`-on-I/O-paths rule. Do not flag `unwrap()` usage on I/O paths in that test in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T09:48:39.014Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), `cleanup_orphaned_version` was updated in PR `#212` (commit 6043501b) so that `remove_file` treats `ErrorKind::NotFound` as success, mirroring the idempotent-delete pattern used in `SuperVersions::maintenance`. Do not flag the absence of a `NotFound` guard in `cleanup_orphaned_version` in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.461Z
Learning: In `src/config/mod.rs` (structured-world/coordinode-lsm-tree), the `with_fs` and `with_shared_fs` constructor doc comments were updated (beyond the `Config::fs` field doc) in PR `#212` to remove stale "reopen limitation" notes, reflecting that MemFs-backed trees can now be fully reopened. Do not flag those constructor docs as stale or missing reopen guidance in future reviews of PR `#212` or later code.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/zstd_dict_roundtrip.rs`:
- Around line 264-294: Add an explicit postcondition after calling
tree.major_compact(...) to assert that compaction actually reduced L0 tables (or
total SST count): query the tree for its table/level state (e.g., use whatever
API your tree exposes such as a method that returns number of tables or tables
per level, e.g., tree.num_tables_in_level(0) or
tree.table_count()/tree.levels()) and assert that L0 is zero or the total table
count decreased compared to pre-compaction; place this check immediately after
the major_compact call (before the read/scan assertions) and fail the test if
compaction did not occur.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2925a12a-264b-4b96-86eb-0520641c0045

📥 Commits

Reviewing files that changed from the base of the PR and between ef55325 and c9c7ddf.

📒 Files selected for processing (4)
  • .github/workflows/coordinode-ci.yml
  • src/compression/mod.rs
  • src/compression/zstd_pure.rs
  • tests/zstd_dict_roundtrip.rs

Comment thread tests/zstd_dict_roundtrip.rs
…t test

Without this check, if major_compact() ever regresses to a no-op that still
returns Ok(()), the test would pass against the original L0 tables without
exercising the dict compress/decompress compaction hot path.

Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Comment thread tests/zstd_dict_roundtrip.rs Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/compression/mod.rs`:
- Around line 110-111: The doc comment for the handle is incorrect: `new()`
stores the full 64-bit xxh3 fingerprint in `self.id` (see `id64()`), while
`id()` truncates to the lower 32 bits; update the comment to state that the
handle stores the full 64‑bit xxh3 hash of `raw` and that `id()` returns the
lower 32 bits on demand (referencing `ZstdDictionary::id` and `id64()` for
clarity).

In `@tests/zstd_dict_roundtrip.rs`:
- Around line 269-272: The test currently masks failures by calling
tree.level_table_count(0).unwrap_or(0); instead, call level_table_count(0) and
assert on its success and result explicitly: retrieve the Result via
.expect(...) or .unwrap() with a clear message (e.g., on the call to
level_table_count), bind it to a variable (e.g., count) and then assert_eq!(0,
count, "L0 must be empty after major_compact — compaction may not have run");
this ensures failures to fetch the L0 count fail the test rather than being
coerced to zero.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0b978752-95b1-42f5-bebc-a380b250c4d8

📥 Commits

Reviewing files that changed from the base of the PR and between ef55325 and dd0275f.

📒 Files selected for processing (4)
  • .github/workflows/coordinode-ci.yml
  • src/compression/mod.rs
  • src/compression/zstd_pure.rs
  • tests/zstd_dict_roundtrip.rs

Comment thread src/compression/mod.rs Outdated
Comment thread tests/zstd_dict_roundtrip.rs
- Fix test postcondition: assert_eq!(Some(0), level_table_count(0)) instead
  of unwrap_or(0) — properly distinguishes "L0 exists and is empty" from
  unexpected None (Copilot #54, CodeRabbit #56)
- Fix ZstdDictionary::new() doc: handle stores full 64-bit xxh3 hash
  internally; id() returns lower 32 bits on demand (CodeRabbit #55)

Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Comment thread src/compression/zstd_pure.rs
Comment thread Cargo.toml

Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.

@polaz polaz merged commit abd0ee5 into main Apr 9, 2026
23 checks passed
@polaz polaz deleted the feat/#218-featcompression-enable-dictionary-compression-in-p branch April 9, 2026 10:16
@sw-release-bot sw-release-bot Bot mentioned this pull request Apr 9, 2026
polaz pushed a commit that referenced this pull request Apr 9, 2026
## 🤖 New release

* `coordinode-lsm-tree`: 4.3.1 -> 4.4.0

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [4.4.0](v4.3.1...v4.4.0) - 2026-04-09

### Added

- *(compression)* enable dictionary compression in pure Rust backend
([#229](#229))

### Performance

- *(compression)* cache pre-compiled Dictionary across block decompress
calls
([#227](#227))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: sw-release-bot[bot] <255865126+sw-release-bot[bot]@users.noreply.github.com>
