feat(compression): enable dictionary compression in pure Rust backend #229
Conversation
- Implement `compress_with_dict()` in `ZstdPureProvider` using `FrameCompressor::set_dictionary()` from `structured-zstd` v0.0.7, which added dictionary encoder support in structured-world/structured-zstd#25
- Support both finalized zstd dictionaries (magic 0x37A430EC + entropy tables, parsed via `decode_dict`) and raw content dictionaries (bare byte sequences, parsed via `from_raw_content` with xxh3-derived ID), matching the transparent handling in the C FFI backend
- Remove the config-time guard that rejected `ZstdDict` compression policies under the pure backend — the feature is now fully supported
- Update `decompress_with_dict` to apply the same format-detection logic so raw content dicts round-trip correctly end-to-end

Unit tests (src/compression/zstd_pure.rs):
- compress_with_dict_roundtrip_pure_to_pure
- compress_with_dict_produces_zstd_magic
- compress_with_dict_roundtrip_all_levels (levels 1, 3, 9, 19)
- compress_with_dict_empty_dict_returns_error
- compress_with_dict_raw_content_dict_works
- compress_with_dict_empty_plaintext_roundtrips

Integration tests (tests/zstd_dict_roundtrip.rs, zstd_pure_dict mod):
- pure_tree_write_flush_read_zstd_dict (full Tree write→flush→read)
- pure_tree_range_scan_with_zstd_dict
- pure_tree_reopen_with_dict_reads_back_correctly
- pure_zstd_dict_missing_returns_error

Closes #218
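As a rough, self-contained illustration of the finalized-vs-raw handling described above (a sketch, not the crate's code; it assumes the `xxhash-rust` crate for `xxh3_64`), the dispatch boils down to a magic-prefix check plus an xxh3-derived synthetic ID for the raw-content case:

```rust
use xxhash_rust::xxh3::xxh3_64;

/// First four bytes of a finalized zstd dictionary (byte sequence 37 A4 30 EC).
const DICT_MAGIC: [u8; 4] = [0x37, 0xA4, 0x30, 0xEC];

/// The two dictionary shapes both backends accept.
enum DictFormat {
    /// Magic header + entropy tables + content (e.g. produced by `zstd --train`);
    /// parsed with the library's finalized-dictionary decoder (`decode_dict`).
    Finalized,
    /// Bare byte sequence; handed to `from_raw_content` with a synthetic non-zero ID.
    RawContent { synthetic_id: u32 },
}

fn classify_dict(dict_raw: &[u8]) -> DictFormat {
    if dict_raw.starts_with(&DICT_MAGIC) {
        DictFormat::Finalized
    } else {
        // zstd frame headers reserve dict ID 0, hence the clamp to at least 1.
        DictFormat::RawContent {
            synthetic_id: (xxh3_64(dict_raw) as u32).max(1),
        }
    }
}
```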
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
Replaced the C-FFI zstd backend with a single pure-Rust structured-zstd backend; implemented dictionary compression/decompression (finalized vs raw-content) with TLS caching and a compressor cache; simplified the ZstdDictionary API; removed zstd_ffi; updated Cargo/features, the build script, CI, README, benchmarks, and tests.
Sequence Diagram(s)
sequenceDiagram
participant Caller
participant ZstdPure as ZstdPureProvider
participant TLSCache
participant Compressor as FrameCompressor
participant Decoder as FrameDecoder
rect rgba(200,200,255,0.5)
Caller->>ZstdPure: compress_with_dict(input, dict_raw, level)
ZstdPure->>TLSCache: lookup (xxh3_64(dict_raw), level)
TLSCache-->>ZstdPure: compressor (create if missing)
ZstdPure->>Compressor: compress(input, dict)
Compressor-->>ZstdPure: compressed_frame
ZstdPure->>ZstdPure: if raw-content dict → strip_dict_id(frame)
ZstdPure-->>Caller: compressed_frame
end
rect rgba(200,255,200,0.5)
Caller->>ZstdPure: decompress_with_dict(frame, dict_raw, capacity)
ZstdPure->>ZstdPure: detect dict format (DICT_MAGIC / raw)
alt raw-content path
ZstdPure->>Decoder: init with force_dict + bounded decode
Decoder-->>ZstdPure: decompressed_bytes / size_error
else finalized-dict path
ZstdPure->>Decoder: decode_all_to_vec(finalized)
Decoder-->>ZstdPure: decompressed_bytes
end
ZstdPure-->>Caller: decompressed_bytes / error
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Codecov Report: ❌ Patch coverage is
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/compression/mod.rs`:
- Around line 38-43: The documentation for the compression function is too
strict: update the doc comment that currently says "must be a finalized zstd
dictionary" to state that dict_raw accepts both finalized zstd dictionaries
(magic 0x37A430EC header, entropy tables, content) and bare content dictionaries
as produced/returned by ZstdDictionary::new() / ZstdDictionary::raw(); mention
that both the C FFI backend and the pure Rust backend accept both formats so
callers know either form is valid.
In `@src/compression/zstd_pure.rs`:
- Around line 111-118: The frame encoder currently clamps the xxh3 low-32-bit
hash to 1 locally (let id = { xxh3_64(dict_raw) as u32; h.max(1) }) causing
mismatch with ZstdDictionary::id() and CompressionType::ZstdDict; instead
compute a single normalized dict id from the raw hash in the shared
dictionary-ID path (e.g., produce raw_id = (xxh3_64(dict_raw) as u32) and
normalized_id = raw_id.max(1)) and use normalized_id everywhere metadata,
ZstdDictionary::id(), CompressionType::ZstdDict, and frame encoding read/write
IDs; remove the local h.max(1) clamping in the encoder and apply the same change
for the other occurrence around lines 179-183 so both metadata and frame headers
observe the identical normalized id.
In `@tests/zstd_dict_roundtrip.rs`:
- Around line 270-271: The zstd_pure_dict module is gated with #[cfg(all(feature
= "zstd-pure", not(feature = "zstd")))] and will be skipped by cargo test
--all-features, so add a dedicated CI job that runs tests with the pure backend
enabled (e.g., run cargo test --no-default-features --features zstd-pure or
cargo test --features zstd-pure while ensuring the "zstd" feature is not
enabled) so the zstd_pure_dict module is exercised; update your CI workflow to
include this new job (name it clearly, e.g., "test: zstd-pure") and ensure it
uses the same test matrix/platforms as the other test jobs.
- Around line 278-291: The current make_test_dictionary() only wraps raw sample
bytes with ZstdDictionary::new(), so tests never exercise the
finalized-dictionary path; change or add a test fixture that actually
trains/finalizes a dictionary (use the library's training/finalize API instead
of ZstdDictionary::new()), then use that finalized dictionary in an end-to-end
encode/decode test to exercise Dictionary::decode_dict and
FrameCompressor::set_dictionary_from_bytes; locate make_test_dictionary and
replace or add a make_finalized_test_dictionary that invokes the training
function (or calls the dictionary finalization method) on the same samples and
return the finalized ZstdDictionary for the new test.
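Picking up the last comment above: a fixture along these lines could train a genuinely finalized dictionary. This is only a sketch under assumptions; it leans on the C-backend `zstd` crate's `zstd::dict::from_samples` trainer, and the `ZstdDictionary::new` signature is guessed from the surrounding discussion.

```rust
// Hypothetical fixture (names and signatures assumed, not taken from the crate):
// trains a finalized dictionary (magic bytes 37 A4 30 EC) from the same samples,
// so tests exercise Dictionary::decode_dict instead of the raw-content path.
#[cfg(feature = "zstd")]
fn make_finalized_test_dictionary(samples: &[Vec<u8>]) -> ZstdDictionary {
    let finalized = zstd::dict::from_samples(samples, 16 * 1024)
        .expect("dictionary training should succeed on the test corpus");
    ZstdDictionary::new(finalized)
}
```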
Pull request overview
This PR enables zstd dictionary compression for the zstd-pure backend (structured-zstd), removing the previous config-time rejection and adding unit + integration tests to exercise dictionary-compressed Tree roundtrips under the pure backend.
Changes:
- Implement `compress_with_dict()` for `ZstdPureProvider`, including finalized-vs-raw dictionary format detection.
- Remove the `Config::open()` guard that previously rejected `CompressionType::ZstdDict` under `zstd-pure`.
- Add new unit tests (`src/compression/zstd_pure.rs`) and integration tests (`tests/zstd_dict_roundtrip.rs`) for pure-backend dictionary compression.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/compression/zstd_pure.rs | Implements pure-backend dictionary compression and expands unit test coverage. |
| src/config/mod.rs | Removes the pure-backend dictionary-compression rejection during config validation. |
| src/compression/mod.rs | Updates trait-level docs to reflect dictionary support in the pure backend. |
| tests/zstd_dict_roundtrip.rs | Adds integration tests for dict compression via the pure backend through the Tree API. |
| README.md | Updates documentation by removing the old “dict compression unsupported” limitation for zstd-pure. |
…-zero
- Move .max(1) into ZstdDictionary::id() so config validation, stored metadata, and zstd frame headers all observe the same dict id (id=0 is invalid in the zstd frame format; edge case when xxh3 lower 32 bits are zero)
- Remove redundant .max(1) from decompress_with_dict raw-content path
- Update compress_with_dict trait doc to reflect that both finalized and raw-content dictionaries are accepted by both backends
- Add dedicated test-zstd-pure CI job to exercise the pure backend independently (--all-features enables C zstd and skips the not(feature = "zstd") gate)
- Add pure_finalized_dict_roundtrip integration test to cover the Dictionary::decode_dict path in the pure Rust backend
- Correct make_test_dictionary() doc: builds a raw-content dict, not a finalized-dictionary fixture
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/coordinode-ci.yml:
- Around line 62-80: Add a new step in the existing test-zstd-pure job to run
clippy under the pure-only feature combo: insert a step before the "Run tests"
step that runs `cargo clippy --no-default-features --features zstd-pure,lz4
--all-targets -- -D warnings`; this ensures code behind #[cfg(all(feature =
"zstd-pure", not(feature = "zstd")))] (e.g. src/compression/zstd_pure.rs and
tests/zstd_dict_roundtrip.rs) is linted and won't introduce warnings.
In `@src/compression/mod.rs`:
- Around line 158-176: The current normalization in ZstdDictionary::id() forces
id=0 to 1 and breaks read compatibility: revert or limit that normalization so
existing stored metadata with dict_id == 0 can still be recognized;
specifically, keep ZstdDictionary::id64() unchanged, change ZstdDictionary::id()
to return the raw lower 32 bits (including 0) for read/validation paths and
perform the clamp-to-1 only in write/encoding code paths that emit new frame
headers (i.e., where CompressionType::ZstdDict is serialized), or alternatively
make validation logic that compares dictionaries accept both raw 0 and
normalized 1 as equivalent for the same dictionary; update the code paths that
currently call ZstdDictionary::id() for config validation and block/table
open-time checks to use the raw value or the tolerant comparison, and add a
regression test that writes metadata with dict_id == 0 and verifies it can be
read back without triggering ZstdDictMismatch.
- Add TLS FrameCompressor cache to compress_with_dict in pure Rust backend, matching the existing FrameDecoder TLS cache in decompress_with_dict
- Cache key: (xxh3_64(dict_raw), level); dictionary is parsed at most once per thread per distinct (dict, level) pair
- Use owned Cursor<Vec<u8>> source and Vec<u8> drain for 'static TLS compat
- Revert ZstdDictionary::id() to return raw lower-32-bits of xxh3 (no clamping); backends that embed dict_id in the zstd frame header clamp to .max(1) themselves to preserve backward compat with on-disk dict_id=0 (theoretical)
- Update id() doc to clarify that clamping is the backend's responsibility
- Restore dict.id().max(1) in decompress_with_dict raw-content path
- Add cargo clippy step for pure backend in test-zstd-pure CI job
- Update benchmark.yml actions/checkout from v4 to v6
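A minimal sketch of the per-thread memoizer this commit describes; the cached entry is simplified to its key here (the real entry also owns the `FrameCompressor` with its `Cursor<Vec<u8>>` source and `Vec<u8>` drain), so treat the shape as illustrative only:

```rust
use std::cell::RefCell;

// Simplified stand-in: the real entry holds the parsed dictionary and a
// FrameCompressor built for this (dict, level) pair.
struct CachedCompressor {
    dict_hash: u64,
    level: i32,
}

thread_local! {
    // Single entry per thread: the dictionary is re-parsed only when the
    // (xxh3_64(dict_raw), level) key changes on this thread.
    static TLS_COMPRESSOR: RefCell<Option<CachedCompressor>> = RefCell::new(None);
}

fn with_cached_compressor<R>(dict_hash: u64, level: i32, f: impl FnOnce(&mut CachedCompressor) -> R) -> R {
    TLS_COMPRESSOR.with(|cell| {
        let mut slot = cell.borrow_mut();
        let reuse = matches!(&*slot, Some(c) if c.dict_hash == dict_hash && c.level == level);
        if !reuse {
            // Parse the dictionary and rebuild the compressor here (omitted).
            *slot = Some(CachedCompressor { dict_hash, level });
        }
        f(slot.as_mut().expect("slot populated above"))
    })
}
```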
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
src/compression/mod.rs:178
- PR description says `ZstdDictionary::id()` is normalized to always return ≥1, but the current implementation returns the raw lower 32 bits of xxh3 and the docs explicitly note it may be 0. Since id=0 is reserved/invalid in zstd frame headers, leaving `id()` able to return 0 means the on-disk `CompressionType::ZstdDict { dict_id }` can be 0 while the pure backend clamps the frame header dict ID to ≥1 for raw-content dictionaries. Please either implement the promised normalization (and adjust docs/tests accordingly) or update the PR description to match the intended semantics.
/// Returns a 32-bit dictionary fingerprint (lower 32 bits of xxh3).
///
/// Intended for config validation (matching a `CompressionType::ZstdDict`
/// `dict_id` against the supplied `ZstdDictionary`) and external interop.
///
/// The value is the raw lower 32 bits of xxh3 and may theoretically be `0`
/// (probability ≈ 1/2³²). Backends that embed a dict ID in the zstd frame
/// header (where id=0 is reserved) are responsible for clamping to at
/// least 1 themselves. Config validation is unaffected: both sides derive
/// the ID from the same bytes and therefore agree even in the zero case.
///
/// For internal cache keying use [`id64`](ZstdDictionary::id64) to avoid
/// hash collisions.
#[must_use]
#[expect(
clippy::cast_possible_truncation,
reason = "intentional: public API returns 32-bit fingerprint"
)]
pub fn id(&self) -> u32 {
self.id as u32
}
- Fix module-level doc: remove reference to set_dictionary_from_bytes (implementation uses set_dictionary with a parsed Dictionary)
- Reuse the exhausted Cursor<Vec<u8>> source buffer across compress() calls via take_source().map_or_else(...) to avoid a per-block O(data.len()) allocation after the first call on a given thread; drain ownership is transferred to the caller so its capacity cannot be recovered without an extra copy
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/compression/zstd_pure.rs`:
- Line 130: Extract the duplicated DICT_MAGIC array into a single module-level
constant and use it wherever the inline definition appears: add a top-level
const DICT_MAGIC: [u8; 4] = [0x37, 0xA4, 0x30, 0xEC] (with doc comment) near the
file imports, then remove the inline definitions in the functions (e.g., the
current local DICT_MAGIC in decompress_with_dict) and replace their uses with
the new module-level DICT_MAGIC constant so the magic value is defined once.
@coderabbitai full review
✅ Actions performed: Full review triggered.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/compression/zstd_pure.rs`:
- Around line 146-159: The test suite is missing a cross-backend regression test
for raw-content dictionaries: add a test that creates a raw-content dictionary
(use Dictionary::from_raw_content with the synthetic id logic based on dict_key
and DICT_MAGIC/Dictionary::decode_dict semantics), compresses data using the
pure backend, then attempts to decompress the resulting frame using the FFI
backend to assert successful round-trip; ensure the test exercises the path
where dict_raw does not start with DICT_MAGIC so the synthetic id branch (the id
computed from dict_key) is used, and verify decompression succeeds despite the
differing dictID treatment between backends.
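A hedged sketch of the requested cross-backend test; the provider names (`ZstdPureProvider`, plus an FFI counterpart called `ZstdFfiProvider` here) and the exact `compress_with_dict` / `decompress_with_dict` signatures are assumed from the discussion above rather than copied from the crate:

```rust
#[cfg(all(test, feature = "zstd", feature = "zstd-pure"))]
mod cross_backend_sketch {
    use super::*;

    #[test]
    fn pure_compress_ffi_decompress_raw_content_dict_roundtrip() {
        // Raw-content dictionary: no DICT_MAGIC prefix, so the synthetic-id branch is taken.
        let dict_raw: Vec<u8> = b"shared raw-content dictionary sample bytes".repeat(16);
        let data: Vec<u8> = b"payload that benefits from the shared dictionary ".repeat(32);

        let frame = ZstdPureProvider
            .compress_with_dict(&data, &dict_raw, 3)
            .expect("pure-backend compression");
        let out = ZstdFfiProvider
            .decompress_with_dict(&frame, &dict_raw, data.len())
            .expect("FFI-backend decompression despite differing dictID treatment");
        assert_eq!(data, out);
    }
}
```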
The pure Rust backend (structured-zstd) rejects dictID=0 for
Dictionary::from_raw_content, so it internally assigns a synthetic
non-zero ID derived from xxh3_64(dict_raw). The C zstd library always
records dictID=0 (absent) for raw-content dicts, causing two failures:
pure → FFI: pure embeds synthetic dictID in frame header;
libzstd ZSTD_DDict has id=0; decompressor checks
dctx->dictID(0) != fParams.dictID(synthetic) → "Dictionary mismatch".
FFI → pure: FFI records dictID=0 in frame; structured-zstd
treats dictID=0 as "no dict required" and skips dict lookup,
then fails decompression with NotEnoughBytesInDictionary.
Fix the pure backend in both directions:
compress_with_dict (raw-content): after compression, strip the
Dict_ID field from the zstd frame header via strip_dict_id().
The synthetic ID is an internal detail — removing it aligns the
output with the C convention (dictID absent/0), so the C FFI
decompressor skips the ID check and accepts the frame.
decompress_with_dict (raw-content): replace decode_all_to_vec
(which calls init() internally and skips dict loading when
dictID=0) with the manual init() → force_dict() → decode_blocks()
→ collect() flow. force_dict() loads the raw-content dict
regardless of the frame's dictID field, handling all three cases:
- Frame from C FFI backend (dictID absent): force_dict loads dict.
- Frame from new pure backend (dictID stripped): same.
- Frame from old pure backend (dictID=synthetic): force_dict
reloads same dict (idempotent).
Add cross-backend interoperability tests gated on
#[cfg(all(test, feature = "zstd", feature = "zstd-pure"))]:
pure_compress_ffi_decompress_raw_content_dict_roundtrip
ffi_compress_pure_decompress_raw_content_dict_roundtrip
mod zstd_pure is now compiled under #[cfg(feature = "zstd-pure")]
(previously not(feature = "zstd")) so it is visible to the
cross-backend test module when --all-features is used.
Closes #218
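Roughly, the raw-content decode flow this commit describes looks like the sketch below; the `FrameDecoder` method names (`init`, `force_dict`, `decode_blocks`, `collect`) come from the commit message itself, but their signatures, argument shapes, and error types are assumptions:

```rust
// Illustrative only: structured-zstd types and call signatures are assumed.
fn decode_raw_content(frame: &[u8], dict: &Dictionary, capacity: usize) -> Result<Vec<u8>, String> {
    let mut src = frame;
    let mut decoder = FrameDecoder::new();
    // Parse the frame header; for raw-content dicts the dictID field may be absent or 0.
    decoder.init(&mut src).map_err(|e| e.to_string())?;
    // Load the raw-content dict regardless of what the header's dictID says.
    decoder.force_dict(dict).map_err(|e| e.to_string())?;
    decoder
        .decode_blocks(&mut src, BlockDecodingStrategy::All)
        .map_err(|e| e.to_string())?;
    let out = decoder.collect().unwrap_or_default();
    // Post-decode size check as a fallback bound on the output.
    if out.len() > capacity {
        return Err(format!("decoded {} bytes exceeds capacity limit {capacity}", out.len()));
    }
    Ok(out)
}
```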
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'lsm-tree db_bench'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.
| Benchmark suite | Current: ffd9512 | Previous: ab61d33 | Ratio |
|---|---|---|---|
| mergerandom | 261870.7610778383 ops/sec (normalized) | 380260.6261186645 ops/sec (normalized) | 1.45 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/compression/zstd_pure.rs`:
- Around line 151-174: The TLS variables (e.g., TLS_COMPRESSOR holding
Option<(u64, i32, CachedCompressor)>) implement a single-entry memoizer contrary
to the comment; replace the Option with a small keyed cache (HashMap or tiny
LRU) keyed by (dict_hash: u64, level: i32) so a thread can store multiple
CachedCompressor entries and avoid re-parsing when switching
dictionaries/levels; update the thread_local! RefCell<Option<...>> declarations
to RefCell<HashMap<(u64, i32), CachedCompressor>> or a bounded LRU map and apply
the same change to the analogous TLS cells referenced in the diff (the decoder
TLS instances), ensuring lookups insert/evict correctly and keys use
dict.id64()/level as described in the comments.
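A sketch of the keyed per-thread cache this comment asks for; the entry type is a stand-in (the real one would own the parsed compressor):

```rust
use std::cell::RefCell;
use std::collections::HashMap;

struct CachedCompressor; // stand-in; the real entry owns a parsed FrameCompressor

thread_local! {
    // Keyed per-thread cache: one entry per (dict.id64(), level), so switching
    // dictionaries or levels on a thread no longer discards the previous compressor.
    static TLS_COMPRESSORS: RefCell<HashMap<(u64, i32), CachedCompressor>> =
        RefCell::new(HashMap::new());
}

fn with_compressor<R>(key: (u64, i32), f: impl FnOnce(&mut CachedCompressor) -> R) -> R {
    TLS_COMPRESSORS.with(|cell| {
        let mut map = cell.borrow_mut();
        let entry = map.entry(key).or_insert_with(|| {
            // Parse the dictionary and build the compressor once per key (omitted).
            CachedCompressor
        });
        f(entry)
    })
}
```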
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/compression/zstd_pure.rs (1)
344-345: ⚠️ Potential issue | 🟡 Minor: Keep the finalized-dictionary magic wording consistent.
These comments still spell the header as `0x37A430EC`, which reintroduces the endian confusion you already fixed for `DICT_MAGIC`. Reword them to reference the byte sequence (37 A4 30 EC) or the corrected little-endian value instead.
Also applies to: 499-503
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/compression/zstd_pure.rs`:
- Around lines 344-345: Update the finalized-dictionary comment wording so it does not use the big-endian hex literal 0x37A430EC (which causes endian confusion); instead reference the byte sequence "37 A4 30 EC" or the corrected little-endian numeric value wherever the header is described (e.g., the comments near DICT_MAGIC and the finalized-dictionary descriptions in src/compression/zstd_pure.rs, including the other occurrence around the block that spans lines ~499-503). Ensure both places use the same phrasing ("byte sequence 37 A4 30 EC" or the corrected little-endian value) so the comment is unambiguous across platforms and matches the DICT_MAGIC constant.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/coordinode-ci.yml:
- Around line 122-124: The coverage comment on the workflow step that runs
`cargo +nightly llvm-cov --no-report nextest --no-default-features --features
zstd,lz4` is misleading because `--all-features` now includes the `zstd` backend
(since `zstd-pure` is an alias), so update the comment to state that this extra
run validates a narrower feature combination (zstd and lz4 without default
features) rather than exercising dictionary-compression paths missed by
`--all-features`; keep the note that the extra run is intentional for validating
that specific feature set and remove the incorrect claim about exercising
dictionary paths.
In `@src/compression/mod.rs`:
- Around line 28-34: Update the doc comment for the compression function to
avoid implying ZstdDictionary::raw() only yields a finalized dictionary:
rephrase the sentence under "Compress `data` using a zstd dictionary." so it
states that `dict_raw` may be either a finalized zstd dictionary (magic
0x37A430EC, entropy tables, content — produced by `zstd --train`) or raw content
bytes, and that callers may pass bytes produced by `zstd --train` or bytes
obtained from `ZstdDictionary::raw()`; make clear the zstd backend accepts
either representation. Target the doc block surrounding the function that begins
with "Compress `data` using a zstd dictionary." and adjust the wording
accordingly.
In `@src/compression/zstd_pure.rs`:
- Around line 179-183: Replace the misleading "// SAFETY:" comment above the
slice access with "// Invariant:" because this is documenting a safe
slice-bounds invariant (not reasoning for an unsafe block); update the comment
that precedes the call to output.get_mut(prev_len..) / unwrap_or_else(...) so it
reads "// Invariant: ..." and explains that output was resized to new_len and
prev_len.. is a valid slice (referencing variables output, prev_len, new_len,
and dest to locate the code).
---
Duplicate comments:
In `@src/compression/zstd_pure.rs`:
- Around line 344-345: Update the finalized-dictionary comment wording so it
does not use the big-endian hex literal 0x37A430EC (which causes endian
confusion); instead reference the byte sequence "37 A4 30 EC" or the corrected
little-endian numeric value wherever the header is described (e.g., the comments
near DICT_MAGIC and the finalized-dictionary descriptions in
src/compression/zstd_pure.rs, including the other occurrence around the block
that spans lines ~499-503). Ensure both places use the same phrasing ("byte
sequence 37 A4 30 EC" or the corrected little-endian value) so the comment is
unambiguous across platforms and matches the DICT_MAGIC constant.
…issues
- Standardize '0x37A430EC' → bytes '37 A4 30 EC' in all comments (mod.rs compress_with_dict, zstd_pure.rs compress/decompress paths)
- mod.rs compress_with_dict: clarify ZstdDictionary::raw() is an accessor, not a dict producer; mention it's accessible for persistence/interop
- zstd_pure.rs decode_raw_content_bounded: rename '// SAFETY:' → '// Invariant:' (not an unsafe block, documents safe slice-bounds invariant)
- zstd_dict_roundtrip.rs: fix assertion message 'L0 SSTs' → 'tables' (table_count() returns total across all levels, not L0-only)
- coordinode-ci.yml: correct coverage comment — zstd-pure is now an alias for zstd, so the extra run validates a narrower feature set, not dict paths missed by --all-features
@coderabbitai re-review
✅ Actions performed: Review triggered.

@coderabbitai re-review
✅ Actions performed: Review triggered.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/zstd_dict_roundtrip.rs`:
- Around line 264-294: Add an explicit postcondition after calling
tree.major_compact(...) to assert that compaction actually reduced L0 tables (or
total SST count): query the tree for its table/level state (e.g., use whatever
API your tree exposes such as a method that returns number of tables or tables
per level, e.g., tree.num_tables_in_level(0) or
tree.table_count()/tree.levels()) and assert that L0 is zero or the total table
count decreased compared to pre-compaction; place this check immediately after
the major_compact call (before the read/scan assertions) and fail the test if
compaction did not occur.
…t test
Without this check, if major_compact() ever regresses to a no-op that still returns Ok(()), the test would pass against the original L0 tables without exercising the dict compress/decompress compaction hot path.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/compression/mod.rs`:
- Around line 110-111: The doc comment for the handle is incorrect: `new()`
stores the full 64-bit xxh3 fingerprint in `self.id` (see `id64()`), while
`id()` truncates to the lower 32 bits; update the comment to state that the
handle stores the full 64‑bit xxh3 hash of `raw` and that `id()` returns the
lower 32 bits on demand (referencing `ZstdDictionary::id` and `id64()` for
clarity).
In `@tests/zstd_dict_roundtrip.rs`:
- Around line 269-272: The test currently masks failures by calling
tree.level_table_count(0).unwrap_or(0); instead, call level_table_count(0) and
assert on its success and result explicitly: retrieve the Result via
.expect(...) or .unwrap() with a clear message (e.g., on the call to
level_table_count), bind it to a variable (e.g., count) and then assert_eq!(0,
count, "L0 must be empty after major_compact — compaction may not have run");
this ensures failures to fetch the L0 count fail the test rather than being
coerced to zero.
- Fix test postcondition: assert_eq!(Some(0), level_table_count(0)) instead of unwrap_or(0) — properly distinguishes "L0 exists and is empty" from unexpected None (Copilot #54, CodeRabbit #56)
- Fix ZstdDictionary::new() doc: handle stores full 64-bit xxh3 hash internally; id() returns lower 32 bits on demand (CodeRabbit #55)
… in decode_raw_content_bounded
## 🤖 New release
* `coordinode-lsm-tree`: 4.3.1 -> 4.4.0

Changelog:
## [4.4.0](v4.3.1...v4.4.0) - 2026-04-09
### Added
- *(compression)* enable dictionary compression in pure Rust backend ([#229](#229))
### Performance
- *(compression)* cache pre-compiled Dictionary across block decompress calls ([#227](#227))

---
This PR was generated with [release-plz](https://github.com/release-plz/release-plz/).
Co-authored-by: sw-release-bot[bot] <255865126+sw-release-bot[bot]@users.noreply.github.com>
Summary
Technical Details
Format detection: `compress_with_dict` and `decompress_with_dict` check for the zstd dictionary magic prefix (bytes `37 A4 30 EC`, little-endian `0xEC30A437`):
Raw-content dict ID stripping: `compress_with_dict` strips the synthetic dictID from the frame header after compression. This matches the zstd standard convention where `dictID=0` (absent) means "raw-content dict, id unknown, accept any", preventing decompressors from requiring the specific synthetic id.
Decompression-bomb guard: The raw-content dict decompress path calls `decoder.content_size()` after `init()` and rejects frames whose declared size exceeds the caller's capacity limit before allocating the output buffer. Frames without the FCS field fall back to the post-decode check.
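A self-contained sketch of that guard, assuming the decoder reports the declared content size as an `Option` that is `None` when the FCS field is absent (an assumption for illustration, not the crate's actual signature):

```rust
/// Illustrative bound check (not the crate's actual code): reject frames whose
/// declared decompressed size exceeds the caller's capacity before allocating.
fn check_declared_size(declared: Option<u64>, capacity: usize) -> Result<(), String> {
    match declared {
        Some(size) if size > capacity as u64 => Err(format!(
            "declared decompressed size {size} exceeds capacity limit {capacity}"
        )),
        // No FCS field: fall back to the post-decode length check.
        _ => Ok(()),
    }
}
```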
`ZstdDictionary::id()`: Returns the raw lower 32 bits of xxh3 (may theoretically be 0). Config validation paths compare `dict.id()` against on-disk `dict_id` — both sides derive the same value, so validation is unaffected. The `.max(1)` clamp is only applied inside the backend when embedding an id in a zstd frame header.
Blocker resolved: structured-world/structured-zstd#25 (FastCOVER + dictionary finalization) merged; `structured-zstd` bumped to v0.0.11 (encoding performance improvements: row-based match finder, HC positions rebase, streaming scratch buffer reuse, FSE decoder packing, and HC table improvements).
Known Limitations
Test Plan
Closes #218
Summary by CodeRabbit
New Features
Documentation
Tests
Chores