refactor(fs): migrate Tree::open recovery path to Fs trait#212

Merged: polaz merged 16 commits into main from feat/#209-refactorfs-migrate-treeopen-recovery-path-to-fs-tr on Apr 5, 2026

@polaz (Member) commented Apr 5, 2026

Summary

  • Replace all try_exists() / std::fs calls in the recovery path with pluggable Fs trait methods, enabling MemFs and future non-StdFs backends to reopen trees
  • Add open_section_reader() helper to replace sfa::TocEntry::buf_reader() which used std::fs directly
  • Thread &dyn Fs through recover(), get_current_version(), cleanup_orphaned_version(), SuperVersions::maintenance(), and Manifest::decode_from()
  • Eliminate TOCTOU race in Tree::open — replace exists() probe with atomic read attempt via get_current_version()
  • Eliminate TOCTOU race in version GC — replace exists() + remove_file() with idempotent remove_file() that treats NotFound as success (both SuperVersions::maintenance and cleanup_orphaned_version)
  • Validate table_count and blob_file_count against section length in recovery to prevent allocation-based DoS from corrupt version files
  • Validate CURRENT file checksum type before trusting version_id — detects torn/corrupt pointer files
  • Document NotFound contract on Fs::remove_file
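
The count validation listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it assumes every encoded entry occupies at least a fixed number of bytes, and both `MIN_ENTRY_SIZE` and `validate_count` are hypothetical names (the real lower bound depends on the version file format).

```rust
use std::convert::TryFrom;
use std::io;

/// Hypothetical lower bound on the encoded size of one entry, in bytes.
/// The real on-disk entry size is not stated in this PR; 8 is an assumption.
const MIN_ENTRY_SIZE: u64 = 8;

/// Rejects a decoded element count that could not possibly fit in the
/// section it was read from, before any allocation is sized by it.
fn validate_count(count: u64, section_len: u64) -> io::Result<usize> {
    let max_possible = section_len / MIN_ENTRY_SIZE;
    if count > max_possible {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("count {count} exceeds maximum {max_possible} for a {section_len}-byte section"),
        ));
    }
    // Guard against targets where usize is narrower than u64.
    usize::try_from(count).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

A caller would run a check like this before sizing a buffer from the count (for example with `Vec::with_capacity`), so a corrupt value surfaces as an error instead of driving a huge allocation.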

Behavior change

cleanup_orphaned_version now fails fast on non-UTF-8 filenames (via Fs::read_dir returning InvalidData) instead of silently skipping them with to_string_lossy(). This is intentional — version files are always v{u64}, so non-UTF-8 entries indicate filesystem corruption and should surface as an error.
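
The fail-fast behavior can be illustrated with a small sketch. In the PR the `InvalidData` error comes from `Fs::read_dir`; the helper below, `parse_version_file_name`, is a hypothetical function that folds the same policy into name parsing and is not part of the crate.

```rust
use std::ffi::OsStr;
use std::io;

/// Parses a directory entry name of the form `v{u64}` into its version ID.
/// Returns `Ok(None)` for valid UTF-8 names that are not version files,
/// and an `InvalidData` error for names that are not valid UTF-8.
fn parse_version_file_name(name: &OsStr) -> io::Result<Option<u64>> {
    let name = name.to_str().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidData, "non-UTF-8 file name in tree directory")
    })?;
    Ok(name
        .strip_prefix('v')
        // Accept digits only, so names such as "v+5" are not treated as versions.
        .filter(|id| id.bytes().all(|b| b.is_ascii_digit()))
        .and_then(|id| id.parse::<u64>().ok()))
}
```

The contrast with `to_string_lossy()` is that a lossy conversion always succeeds, so a corrupted name would be silently treated as "not a version file" rather than reported.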

Test plan

  • All 1087 tests pass (existing + new)
  • 36 doc-tests pass
  • Clippy clean (0 warnings)
  • MemFs round-trip: create → write → flush → drop → reopen → read
  • MemFs manifest decode
  • GC tests with seeded version files and distinct IDs — verify actual file deletion
  • Non-UTF-8 filename regression test (Unix-only, platform-specific error handling)
  • Corruption tests: corrupt table_count and blob_file_count → Unrecoverable
  • Maintenance tests isolated from working directory (MemFs)

Closes #209
Closes #213

Summary by CodeRabbit

  • Bug Fixes

    • Recovery now detects corrupted manifests and rejects extreme/corrupt counts; non‑UTF‑8 filename errors during reopen are surfaced correctly.
    • Version/manifest GC deletes via the configured storage backend and treats missing files as no‑ops.
  • Refactor

    • Core I/O, manifest decoding and version recovery consistently use the configured, pluggable filesystem backend.
  • Tests

    • Added in‑memory and Unix regression tests for reopen, recovery, GC, and corrupt manifest scenarios.
  • Documentation

    • Removed docs claiming reopening was limited to the default filesystem.

- Replace all `try_exists()` / `std::fs` calls in recovery path
  with pluggable `Fs` trait methods
- Add `open_section_reader()` helper to replace `sfa::TocEntry::buf_reader()`
  which bypassed the Fs trait
- Thread `&dyn Fs` through `recover()`, `get_current_version()`,
  `cleanup_orphaned_version()`, `SuperVersions::maintenance()`,
  and `Manifest::decode_from()`
- Use `sfa::Reader::from_reader()` instead of `sfa::Reader::new()`
  to read version files through the Fs trait

Closes #209
Copilot AI review requested due to automatic review settings April 5, 2026 00:26
coderabbitai Bot commented Apr 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8249276f-1e68-475f-8be6-c8bc59dd4f66

📥 Commits

Reviewing files that changed from the base of the PR and between 6101615 and a26c86d.

📒 Files selected for processing (1)
  • src/version/recovery.rs

Walkthrough

Threads the pluggable Fs abstraction through manifest decoding, version recovery, version-maintenance (GC), Tree open/cleanup, and related call sites; replaces direct std::fs I/O with Fs operations, adds fs: &dyn Fs parameters where required, and adapts tests to exercise non-StdFs backends.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Fs core & helper | src/fs/mod.rs | Doc: require io::ErrorKind::NotFound from Fs::remove_file; added pub(crate) fn open_section_reader(...) returning a buffered, length-capped reader into a section. |
| Manifest decoding & tests | src/manifest.rs, tests/*manifest* | Manifest::decode_from gains fs: &dyn Fs; TOC sections read via open_section_reader(fs, ...); test writers/readers updated for pluggable Fs; new MemFs decode tests added. |
| Version recovery | src/version/recovery.rs | get_current_version and recover now accept fs: &dyn Fs; CURRENT and version archives opened via fs.open; section reads use open_section_reader; added bounds checks for table/blob counts and MemFs corruption tests. |
| Version maintenance (GC) | src/version/super_version.rs | SuperVersions::maintenance gains fs: &dyn Fs; manifest deletions use fs.remove_file (treat NotFound as no-op); GC tests adapted to MemFs and assert in-memory file state. |
| Tree open, cleanup & ingest | src/tree/mod.rs, src/tree/ingest.rs | Replaced std::fs probes/opens with config.fs ops; make recover-vs-create decision atomic by calling recover(..., fs) and mapping NotFound→create; orphaned-version cleanup uses fs.read_dir / fs.remove_file; ingestion passes &*self.tree.config.fs into maintenance. |
| Compaction & blob ingestion call sites | src/compaction/worker.rs, src/blob_tree/ingest.rs | Updated version_history_lock.maintenance(...) call sites to pass filesystem handle (&*opts.config.fs / &*index.config.fs) as an additional argument; preserved prior error handling. |
| MemFs docs & tests | src/fs/mem_fs.rs, tests/mem_fs_tree.rs, tests/tree_non_utf8.rs | Removed outdated MemFs limitation docs; added memfs_reopen_recovers_flushed_data test verifying reopen on shared MemFs; added Unix-only test asserting reopen rejects non-UTF-8 filenames. |
| Config docs | src/config/mod.rs | Docstrings updated to state creation and reopening are supported for any Fs implementation; removed prior non-StdFs reopen limitation notes. |

Sequence Diagram(s)

sequenceDiagram
  participant Client as "Tree / Compaction / Ingest"
  participant Fs as "Fs (pluggable) rgba(100,150,240,0.5)"
  participant Sfa as "sfa::Reader rgba(120,200,100,0.5)"
  participant Super as "SuperVersions / version_lock rgba(240,180,80,0.5)"

  Client->>Fs: open(manifest_path, FsOpenOptions::read())
  Fs-->>Client: FsFile (seekable handle)
  Client->>Sfa: sfa::Reader::from_reader(FsFile)
  Client->>Sfa: Manifest::decode_from(path, reader, fs)
  Sfa->>Fs: open_section_reader(path, section)
  Fs-->>Sfa: BufReader(Take capped to section.len())
  Client->>Super: maintenance(folder, gc_watermark, fs)
  Super->>Fs: exists / read_dir / remove_file
  Fs-->>Super: operation results (including NotFound)
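
The "BufReader(Take capped to section.len())" step in the diagram can be sketched as below. This is an assumption-laden illustration: the real `open_section_reader` takes an `Fs` handle and a TOC entry, whereas the hypothetical `section_reader` here accepts any seekable reader plus an explicit offset and length.

```rust
use std::io::{self, BufReader, Read, Seek, SeekFrom, Take};

/// Positions `file` at `offset` and returns a buffered reader that yields
/// at most `len` bytes, so a caller cannot read past the section's end.
fn section_reader<R: Read + Seek>(
    mut file: R,
    offset: u64,
    len: u64,
) -> io::Result<BufReader<Take<R>>> {
    file.seek(SeekFrom::Start(offset))?;
    Ok(BufReader::new(file.take(len)))
}
```

Because the helper only needs `Read + Seek`, the same code path works for an on-disk file handle and for an in-memory buffer such as `std::io::Cursor`.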

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hopped through files and threaded Fs with care,
Opened sections gently, trimmed manifests fair.
MemFs now reopens, GC tidies the trail,
Non-UTF8 names no longer quietly fail.
A nibble and a fix — small hops, big repair.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'refactor(fs): migrate Tree::open recovery path to Fs trait' accurately summarizes the main objective of the PR: migrating the recovery path to use the Fs trait instead of direct std::fs calls. |
| Linked Issues check | ✅ Passed | The PR successfully addresses all coding requirements from issues #209 and #213: replaces std::fs and try_exists() with Fs trait methods throughout recovery, adds open_section_reader(), threads &dyn Fs through all relevant call sites, eliminates TOCTOU by replacing exists() probe with atomic get_current_version() attempt, validates checksum_type and count bounds, and adds comprehensive test coverage. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to migrating the recovery path to the Fs trait. Documentation updates about removed limitations and new MemFs tests are in-scope improvements supporting the core refactoring. No unrelated changes detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

codecov Bot commented Apr 5, 2026

Codecov Report

❌ Patch coverage is 89.69957% with 24 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/version/recovery.rs | 85.71% | 16 Missing ⚠️ |
| src/tree/mod.rs | 88.00% | 3 Missing ⚠️ |
| src/version/super_version.rs | 85.00% | 3 Missing ⚠️ |
| src/manifest.rs | 96.29% | 2 Missing ⚠️ |

Copilot AI left a comment

Pull request overview

Refactors the Tree::open recovery/GC path to consistently use the pluggable Fs trait (instead of std::fs / Path::try_exists), enabling non-StdFs backends like MemFs to reopen trees and participate in recovery.

Changes:

  • Thread &dyn Fs through recovery/maintenance entrypoints (version GC, manifest decoding, recovery).
  • Introduce fs::open_section_reader() to read sfa sections via Fs (replacing sfa::TocEntry::buf_reader() which uses std::fs).
  • Update call sites in tree open/recovery and compaction/ingest maintenance to pass the filesystem backend.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| src/fs/mod.rs | Adds open_section_reader() helper for sfa section reading via Fs. |
| src/manifest.rs | Decodes manifest sections using open_section_reader() + threads &dyn Fs. |
| src/version/recovery.rs | Migrates recovery reads (CURRENT, version file sections) from std::fs to Fs. |
| src/version/super_version.rs | Threads &dyn Fs into version GC and updates tests to pass StdFs. |
| src/tree/mod.rs | Switches try_exists/std::fs recovery logic to Fs (exists, read_dir, remove_file) and threads &dyn Fs through helper calls. |
| src/tree/ingest.rs | Passes &dyn Fs into version history maintenance after ingestion. |
| src/blob_tree/ingest.rs | Passes &dyn Fs into version history maintenance after blob ingestion. |
| src/compaction/worker.rs | Passes &dyn Fs into version history maintenance during compaction finalize paths. |
Comments suppressed due to low confidence (1)

src/tree/mod.rs:1217

  • The primary goal here is enabling non-StdFs backends (MemFs) to reopen via the recovery path, but the PR still lists “Verify MemFs round-trip” as unchecked. Without an automated test that creates a tree on MemFs, drops it, and reopens it (exercising CURRENT/manifest recovery), this change is easy to regress later. Please add a dedicated regression test for MemFs reopen/recovery.
    pub(crate) fn open(config: Config) -> crate::Result<Self> {
        log::debug!("Opening LSM-tree at {}", config.path.display());

        // Check for old version
        if config.fs.exists(&config.path.join("version"))? {
            log::error!(
                "It looks like you are trying to open a V1 database - the database needs a manual migration, however a migration tool is not provided, as V1 is extremely outdated."
            );
            return Err(crate::Error::InvalidVersion(FormatVersion::V1.into()));
        }

        let tree = if config.fs.exists(&config.path.join(CURRENT_VERSION_FILE))? {
            Self::recover(config)
        } else {
            Self::create_new(config)
        }?;

Comment thread src/tree/mod.rs Outdated
…ned_version

Add doc comment explaining that Fs::read_dir fails fast on non-UTF-8
filenames instead of silently skipping them (as the old to_string_lossy
path did). This is correct: version files are always v{u64}, so
non-UTF-8 entries indicate filesystem corruption.
Member Author

@polaz polaz left a comment

Acknowledged — see commit 534e633. The behavior change is intentional: version files are always v{u64}, so non-UTF-8 entries in the data directory indicate filesystem corruption. Fail-fast (via Fs::read_dir returning InvalidData) is the correct behavior here rather than silently skipping corrupted entries. Added a doc comment explaining this.

Verify that a MemFs-backed tree can be reopened after drop and
recover flushed data — the core scenario unlocked by migrating
Tree::open recovery path to the Fs trait.

Also remove stale "known limitations" doc comments for tree-reopen
and version-GC that are now resolved.

Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Comment thread src/manifest.rs

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/tree/mod.rs (2)

1213-1217: ⚠️ Potential issue | 🔴 Critical

Don’t branch on CURRENT via exists().

Line 1213 takes a snapshot of CURRENT and then decides between recover() and create_new(). If another actor creates or removes CURRENT in between, this can take the wrong branch and initialize over an existing tree. Make the decision from a single open/read attempt, or let recover() return NotFound, instead of probing with exists() first.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tree/mod.rs` around lines 1213 - 1217, The code currently probes
CURRENT_VERSION_FILE with config.fs.exists(...) then calls Self::recover or
Self::create_new, which is racy; instead remove the exists() probe and choose
the branch from a single read/open attempt—call Self::recover(config) first and
if it returns a NotFound (or equivalent "missing" error) then call
Self::create_new(config), or modify recover() to return a clear NotFound variant
and handle that here; reference CURRENT_VERSION_FILE, exists(), Self::recover,
and Self::create_new when making the change so the decision is based on one
atomic read attempt rather than a separate exists() check.
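
The "recover first, create on NotFound" decision suggested here can be sketched as a small combinator. This is an illustration only: `recover_or_create` is a hypothetical helper that works on `io::Result`, while the crate's own `Tree::recover` and `Tree::create_new` use `crate::Result` and a different error type.

```rust
use std::io;

/// Runs `recover` first and falls back to `create` only when recovery
/// reports `NotFound`; every other outcome is returned unchanged.
fn recover_or_create<T>(
    recover: impl FnOnce() -> io::Result<T>,
    create: impl FnOnce() -> io::Result<T>,
) -> io::Result<T> {
    match recover() {
        Err(e) if e.kind() == io::ErrorKind::NotFound => create(),
        other => other,
    }
}
```

The branch is chosen from the result of the read attempt itself, so there is no separate `exists()` snapshot that can go stale between the check and the open.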

1351-1393: 🛠️ Refactor suggestion | 🟠 Major

Please add the MemFs reopen regression for this path before merge.

This is the end-to-end recovery path the PR is trying to unblock, but the summary still marks MemFs round-trip verification as pending. Without an integration test that creates a tree on MemFs, closes it, and reopens it, the new Fs plumbing through get_current_version, Manifest::decode_from, recover(), and orphan cleanup is still unproven.

I can help sketch the round-trip test if you want to add it in this PR.

Also applies to: 1500-1697

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tree/mod.rs` around lines 1351 - 1393, Add a MemFs round-trip integration
test that exercises the full recover path: create an LSM tree using an in-memory
Fs (MemFs) with a specific Config/comparator, write enough data to persist a
manifest/version, close the tree, then reopen it via Tree::recover (which will
call crate::version::recovery::get_current_version and Manifest::decode_from)
and assert the reopen succeeds, the comparator name, format version and
level_count are preserved, and there are no orphaned tables after recovery
(verifying orphan cleanup ran). Locate the test near other integration tests for
Tree recovery and reference Tree::recover, get_current_version,
Manifest::decode_from, and Config to ensure the same code path is used for
open/close round-trip on MemFs.
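
The shape of such a round-trip test can be sketched with toy types. Everything below is hypothetical: `ToyFs`, `ToyMemFs`, and `ToyStore` are stand-ins invented for illustration and are far simpler than the crate's `Fs`, `MemFs`, and `Tree`; the point is only that the backend handle is shared, so state survives dropping the store.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

/// Toy stand-in for a pluggable filesystem trait (not the crate's `Fs`).
trait ToyFs {
    fn write(&self, path: &Path, data: &[u8]) -> io::Result<()>;
    fn read(&self, path: &Path) -> io::Result<Vec<u8>>;
}

/// In-memory backend; files live as long as the shared handle does.
#[derive(Default)]
struct ToyMemFs {
    files: Mutex<HashMap<PathBuf, Vec<u8>>>,
}

impl ToyFs for ToyMemFs {
    fn write(&self, path: &Path, data: &[u8]) -> io::Result<()> {
        self.files.lock().unwrap().insert(path.to_path_buf(), data.to_vec());
        Ok(())
    }

    fn read(&self, path: &Path) -> io::Result<Vec<u8>> {
        self.files
            .lock()
            .unwrap()
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::from(io::ErrorKind::NotFound))
    }
}

/// Toy "tree" that persists a single value through the backend.
struct ToyStore {
    fs: Arc<dyn ToyFs>,
    value: Option<Vec<u8>>,
}

impl ToyStore {
    /// Opens the store, recovering the value if the backend already has one.
    fn open(fs: Arc<dyn ToyFs>) -> io::Result<Self> {
        let value = match fs.read(Path::new("data")) {
            Ok(bytes) => Some(bytes),
            Err(e) if e.kind() == io::ErrorKind::NotFound => None,
            Err(e) => return Err(e),
        };
        Ok(Self { fs, value })
    }

    /// Writes the value through the backend, then updates the in-memory copy.
    fn put_and_flush(&mut self, value: &[u8]) -> io::Result<()> {
        self.fs.write(Path::new("data"), value)?;
        self.value = Some(value.to_vec());
        Ok(())
    }
}

/// Create, write, drop, reopen on the same shared backend, then read.
fn round_trip() -> io::Result<Option<Vec<u8>>> {
    let fs: Arc<dyn ToyFs> = Arc::new(ToyMemFs::default());
    let mut store = ToyStore::open(Arc::clone(&fs))?;
    store.put_and_flush(b"hello")?;
    drop(store);
    Ok(ToyStore::open(fs)?.value)
}
```

If `ToyStore::open` bypassed the trait and probed the real filesystem instead, the second open would find nothing, which is the failure mode this kind of regression test is meant to catch.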
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/version/super_version.rs`:
- Line 263: Change tests that call history.maintenance(Path::new("."), 0,
&StdFs) to perform work inside a real temporary directory: create a
tempfile::tempdir(), write the version/manifest files into tempdir.path(), and
call history.maintenance(tempdir.path(), 0, &StdFs) (or equivalent helper) so GC
runs against an isolated on-disk directory; update all occurrences referenced
(the calls to history.maintenance and any setup creating "./v0") at the spots
corresponding to the symbols history.maintenance and StdFs (lines cited: 263,
293, 329, 353, 377) to use tempfile::tempdir() and create files inside that
tempdir instead of Path::new(".").
- Around line 109-110: The current TOCTOU uses fs.exists(&path)? followed by
fs.remove_file(&path)?, which can fail spuriously; remove the exists() probe and
call fs.remove_file(&path) directly inside the same context (in super_version.rs
where the GC/cleanup code runs), and treat a NotFound error as success (ignore
errors with std::io::ErrorKind::NotFound) while propagating other errors.
Replace the two-step check with a single remove_file call and handle the
NotFound case (e.g., match on the error or use map_err/map_or_else) so deletion
is idempotent and race-free.
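
A minimal sketch of the single-call delete described above, written against `std::fs` rather than the crate's `Fs` trait. The function name `remove_file_idempotent` is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Deletes `path`, treating "already gone" as success so that concurrent
/// or repeated cleanup passes do not fail spuriously.
fn remove_file_idempotent(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Ok(()) => Ok(()),
        // Another actor removed the file first: nothing left to do.
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(e) => Err(e),
    }
}
```

Other failures, such as permission errors, are still propagated, so only the benign "file is already absent" case is absorbed.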

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2f24af7c-0735-429e-9ef9-c88b12b5e017

📥 Commits

Reviewing files that changed from the base of the PR and between ea4c28f and 534e633.

📒 Files selected for processing (8)
  • src/blob_tree/ingest.rs
  • src/compaction/worker.rs
  • src/fs/mod.rs
  • src/manifest.rs
  • src/tree/ingest.rs
  • src/tree/mod.rs
  • src/version/recovery.rs
  • src/version/super_version.rs

Comment thread src/version/super_version.rs Outdated
Comment thread src/version/super_version.rs Outdated
polaz added 2 commits April 5, 2026 04:00
Exercise Manifest::decode_from with MemFs backend to verify
seek/take/read semantics work correctly with non-StdFs backends.

Also refactor test helpers to accept &dyn Fs and extract
decode_manifest() to reduce boilerplate.
- Replace exists() + remove_file() with direct remove_file() that
  treats NotFound as success — eliminates TOCTOU race in version GC
- Switch maintenance tests from Path::new(".") with StdFs to MemFs,
  preventing accidental deletion of files in the working directory
Member Author

polaz commented Apr 5, 2026

@coderabbitai

Thread #3 (super_version.rs:110 — TOCTOU): Fixed in ecfa47c — replaced exists() + remove_file() with direct remove_file() that treats NotFound as success.

Thread #4 (super_version.rs:263 — test isolation): Fixed in ecfa47c — switched all 5 maintenance test calls from Path::new(".") + StdFs to MemFs, making tests fully isolated.

Outside-diff #1 (tree/mod.rs:1213 — Tree::open TOCTOU): Valid but out of scope for this PR — tracked in #213.

Outside-diff #2 (tree/mod.rs:1351 — MemFs reopen test): Already added in commit c057154 (memfs_reopen_recovers_flushed_data in tests/mem_fs_tree.rs). Review was based on commits before that push.

coderabbitai Bot commented Apr 5, 2026

@polaz Thanks for the detailed rundown — all four threads are cleanly accounted for.

  • Thread #3: The remove_file-and-treat-NotFound-as-success pattern is the canonical atomic idiom here — good fix.
  • Thread #4: Full MemFs isolation in the maintenance tests is the right call; no on-disk side-effects from the test suite.
  • Outside-diff #1: Noted — will look for it in #213.
  • Outside-diff #2: Acknowledged — the memfs_reopen_recovers_flushed_data test in tests/mem_fs_tree.rs covers the round-trip scenario end-to-end.

(ᓀ ᵕ ᓂ) — solid set of fixes, lgtm.


✏️ Learnings added
Learnt from: polaz
URL: https://github.com/structured-world/coordinode-lsm-tree/pull/212

Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), `SuperVersions::maintenance` GC loop was fixed in PR `#212` (commit ecfa47c8) to eliminate the TOCTOU between `fs.exists()` and `fs.remove_file()`. The canonical fix calls `fs.remove_file()` directly and treats `ErrorKind::NotFound` as success. Do not flag the absence of a preceding `exists()` guard in future reviews of PR `#212` or later code.

Learnt from: polaz
URL: https://github.com/structured-world/coordinode-lsm-tree/pull/212

Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), all maintenance-function unit tests were updated in PR `#212` (commit ecfa47c8) to use `MemFs` instead of `Path::new(".")` + `StdFs`, making them fully isolated from the real filesystem. Do not flag real-filesystem usage in those tests in future reviews of PR `#212` or later code.

Learnt from: polaz
URL: https://github.com/structured-world/coordinode-lsm-tree/pull/212

Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `tests/mem_fs_tree.rs` (structured-world/coordinode-lsm-tree), `memfs_reopen_recovers_flushed_data` was added in PR `#212` (commit c057154b). It exercises a full MemFs lifecycle: create a tree via `Config::with_shared_fs` backed by `Arc<dyn Fs>` (MemFs), insert keys, flush the active memtable, drop the tree, reopen from the same shared MemFs handle, and assert flushed keys/values are recovered. Do not flag the absence of a MemFs round-trip reopen test in future reviews of PR `#212` or later code.


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T16:19:04.825Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), `MemFs::sync_directory` was hardened in PR `#208` (commit 76f3b676) to distinguish between a path that exists as a file (returns an error) and a path that does not exist (returns `NotFound`). Do not flag the absence of this distinction in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T20:34:35.847Z
Learning: In `src/vlog/mod.rs` (structured-world/coordinode-lsm-tree), `recover_blob_files` defers all `insert_for_blob_file` calls into a `pending_cache_inserts` buffer and commits them only after the missing-file guard (`blob_files.len() < ids.len()`) passes. This was fixed in PR `#208` to prevent leaking FDs into the descriptor-table cache on partial/failed recovery. Do not flag the absence of this deferral as a gap in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T16:44:00.075Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), parent-directory existence validation is centralized in an `ensure_parent_dir()` helper shared by `MemFs::open()` and `MemFs::rename()`. This was extracted in PR `#208`, eliminating duplicated logic and allowing the `#[expect(clippy::too_many_lines)]` attribute to be removed from `open()`. Do not flag duplicated parent-dir checks between `open()` and `rename()` as a gap in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T07:48:17.153Z
Learning: In `src/file.rs` (structured-world/coordinode-lsm-tree), `fsync_directory` no longer has a platform-conditional Windows no-op branch (removed in PR `#208`). It now unconditionally delegates to `Fs::sync_directory`. The Windows no-op behavior is encapsulated inside `StdFs::sync_directory`. Do not flag the absence of a platform-conditional Windows branch in `fsync_directory` in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T19:18:08.308Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), empty-path validation is centralized in an `ensure_non_empty_path()` helper that is called at the top of `MemFs::open()`, `MemFs::create_dir_all()`, and `MemFs::rename()`. It returns `InvalidInput` for empty paths before any state mutation. This was added in PR `#208`. Do not flag the absence of inline empty-path checks in those methods in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T16:44:00.075Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), `MemFile::write(&[])` returns `Ok(0)` immediately after the writable-access check (before acquiring the data lock or computing the write position). This was fixed in PR `#208`. Do not flag the absence of this early-return optimization in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 208
File: src/fs/mem_fs.rs:285-300
Timestamp: 2026-04-04T11:18:56.220Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), `MemFs::open()` and `MemFs::rename()` intentionally only check the immediate parent directory, not the full ancestor chain. lsm-tree always calls `create_dir_all` before creating files, so intermediate file-vs-dir conflicts can only arise from external filesystem corruption, which MemFs is not responsible for defending against. A full ancestor walk would add O(depth) cost to every `open()` call. Do not flag the absence of full ancestor-chain validation in MemFs::open() or MemFs::rename() in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 208
File: src/file.rs:64-72
Timestamp: 2026-04-03T21:19:43.478Z
Learning: In `src/file.rs` (structured-world/coordinode-lsm-tree), `rewrite_atomic` is always called with absolute paths (from `Tree::open` and `persist_version`). The `path.parent().expect("should have a parent")` call is intentional and documented; the empty-parent defensive normalization to `"."` is unnecessary. Do not flag the absence of empty-parent normalization in `rewrite_atomic` in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-03T17:42:38.208Z
Learning: In `src/tree/mod.rs` (structured-world/coordinode-lsm-tree), around line 1495, `Tree::open` intentionally checks `CURRENT` via `std::fs` instead of the pluggable `Fs` trait. This is a documented known limitation: MemFs-backed trees cannot be fully re-opened (`Tree::open` reopen is unsupported for MemFs). The bypass has an inline code comment and is tracked for resolution in issue `#209`. Do not flag this `std::fs` usage as a missing Fs-trait wiring in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T20:34:35.847Z
Learning: In `src/table/inner.rs` (structured-world/coordinode-lsm-tree), `Inner::Drop` evicts the descriptor-table cache entry (via `FileAccessor::remove_for_table`) before calling `fs.remove_file` when the `is_deleted` guard fires. This ensures the cached FD is closed before the filesystem-level removal, which is required on Windows (open-file deletion fails). This was fixed in PR `#208`. Do not flag the absence of this evict-before-remove ordering in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 207
File: src/vlog/mod.rs:37-43
Timestamp: 2026-04-03T15:28:06.759Z
Learning: In `src/vlog/mod.rs` (structured-world/coordinode-lsm-tree), `recover_blob_files` intentionally returns `Ok((vec![], vec![]))` when `fs.read_dir(folder)` yields `ErrorKind::NotFound`. Standard (non-blob) trees never create a blobs folder, so `ids` is empty in that case and the downstream `blob_files.len() < ids.len()` guard (0 < 0 = false) passes correctly. Missing blob files for actual blob trees are handled by callers via orphan detection. This mirrors the pre-PR `folder.try_exists()` early-return and is not a correctness bug. Do not flag this early return as skipping the missing-data guard in future reviews.

@polaz polaz requested a review from Copilot April 5, 2026 01:13
Member Author

polaz commented Apr 5, 2026

@coderabbitai full review


coderabbitai Bot commented Apr 5, 2026

✅ Actions performed

Full review triggered.


Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated no new comments.

Member Author

polaz commented Apr 5, 2026

@coderabbitai full review


coderabbitai Bot commented Apr 5, 2026

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/tree/mod.rs`:
- Around line 1206-1215: The public docs still claim reopening is unsupported
for non-StdFs backends even though Tree::open now probes using config.fs and the
new reopen test demonstrates MemFs reopen works; update the doc comment on the
public configuration type (the doc block in config::Config / the comments around
the reopen note) to remove or revise the statement that non-StdFs backends
cannot be reopened and instead document that reopen is supported when the
backend implements the required fs semantics (config.fs probe) and any caveats
(atomic read vs exists probe). Locate the comment near the reopen note in the
config module and change the text to match the new behavior (mention Tree::open
uses config.fs for recovery and that MemFs is supported).

In `@src/version/super_version.rs`:
- Around line 109-112: The code assumes missing-paths are reported as
std::io::ErrorKind::NotFound from Fs::remove_file; update the Fs trait to make
this contract explicit or add an idempotent delete helper and use it instead:
either (A) add documentation on the Fs trait's remove_file method stating
implementations must return Err(io::ErrorKind::NotFound) when the target is
absent (and update StdFs and MemFs docs/implementations to match), or (B) add a
new Fs method like remove_file_if_exists(&self, path: &Path) -> Result<(),
Error> (implemented by StdFs and MemFs to swallow NotFound) and replace the
match in super_version.rs with a call to remove_file_if_exists to guarantee
idempotent delete semantics for manifest GC; reference Fs::remove_file, StdFs,
MemFs, and the manifest GC delete site in super_version.rs when making the
change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a18af3af-b04b-4417-bbc8-ef4845fb8f7b

📥 Commits

Reviewing files that changed from the base of the PR and between ea4c28f and 67aeaac.

📒 Files selected for processing (10)
  • src/blob_tree/ingest.rs
  • src/compaction/worker.rs
  • src/fs/mem_fs.rs
  • src/fs/mod.rs
  • src/manifest.rs
  • src/tree/ingest.rs
  • src/tree/mod.rs
  • src/version/recovery.rs
  • src/version/super_version.rs
  • tests/mem_fs_tree.rs
💤 Files with no reviewable changes (1)
  • src/fs/mem_fs.rs

Comment thread src/version/super_version.rs
- Remove stale note in Config::fs claiming reopen is unsupported
  for non-StdFs backends — it now works via the Fs trait
- Document NotFound contract on Fs::remove_file so implementors
  know version GC relies on this for idempotent deletes
@polaz
Member Author

polaz commented Apr 5, 2026

@coderabbitai

Thread #5 (config docs stale): Fixed in 099f812 — removed the "reopen unsupported" note from Config::fs docs, replaced with "both fresh creation and reopening are supported for any Fs backend."

Thread #6 (remove_file NotFound contract): Fixed in 099f812 — added explicit doc on Fs::remove_file stating implementations must return NotFound for absent files, as version GC relies on this for idempotent deletes.
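
For readers following the contract, here is a minimal sketch of the idempotent-delete idiom that version GC relies on. It is an illustration only: `remove_file_idempotent` is a hypothetical helper, and it calls `std::fs::remove_file` directly rather than going through the crate's `Fs` trait.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Deletes `path`, treating an already-missing file as success.
/// Hypothetical helper: uses std::fs instead of the crate's `Fs` trait.
fn remove_file_idempotent(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Ok(()) => Ok(()),
        // The file is already gone, which is the state the caller wanted.
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
        // Any other failure (permissions, I/O error) is still reported.
        Err(e) => Err(e),
    }
}
```

Because there is no separate `exists()` probe, a concurrent delete between the check and the removal cannot turn into a spurious error; calling the helper twice on the same path succeeds both times.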

@coderabbitai

coderabbitai Bot commented Apr 5, 2026

@polaz Both fixes look great — thanks for the follow-through.

  • Thread #5: Updating Config::fs docs to advertise reopen support for all Fs backends is the right move now that the recovery path is fully wired through the trait. Keeps the public API surface honest.
  • Thread #6: Documenting the NotFound contract directly on Fs::remove_file is exactly the right place for it: implementations return NotFound for absent files and callers such as version GC treat that as success, so nobody has to infer this from implementation behavior or PR history.

Also removing the now-stale learning that said MemFs-backed Tree::open reopen is unsupported — PR #212 resolves that limitation.

(ᓀ ᵕ ᓂ) — all threads closed.


✏️ Learnings added
Learnt from: polaz
URL: https://github.com/structured-world/coordinode-lsm-tree/pull/212

Timestamp: 2026-04-05T07:20:56.398Z
Learning: In `src/config.rs` (structured-world/coordinode-lsm-tree), the `Config::fs` field documentation was updated in PR `#212` (commit 099f812b) to remove the "reopen unsupported" caveat. It now states that both fresh creation and reopening are supported for any `Fs` backend. Do not flag `Config::fs` docs as stale or missing reopen guidance in future reviews of PR `#212` or later code.

Learnt from: polaz
URL: https://github.com/structured-world/coordinode-lsm-tree/pull/212

Timestamp: 2026-04-05T07:20:56.399Z
Learning: In `src/fs/mod.rs` (structured-world/coordinode-lsm-tree), the `Fs::remove_file` trait method was documented in PR `#212` (commit 099f812b) to require that implementations return `ErrorKind::NotFound` for absent files. This contract is load-bearing: `SuperVersions::maintenance` version GC calls `remove_file` directly and relies on `NotFound` as an idempotent no-op. Do not flag the absence of an `exists()` guard before `remove_file` in GC code in future reviews.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


⛔ Learnings removed
Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-03T17:42:38.208Z
Learning: In `src/tree/mod.rs` (structured-world/coordinode-lsm-tree), around line 1495, `Tree::open` intentionally checks `CURRENT` via `std::fs` instead of the pluggable `Fs` trait. This is a documented known limitation: MemFs-backed trees cannot be fully re-opened (`Tree::open` reopen is unsupported for MemFs). The bypass has an inline code comment and is tracked for resolution in issue `#209`. Do not flag this `std::fs` usage as a missing Fs-trait wiring in future reviews.

🧠 Learnings used
Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), `SuperVersions::maintenance` GC loop was fixed in PR `#212` (commit ecfa47c8) to eliminate the TOCTOU between `fs.exists()` and `fs.remove_file()`. The canonical fix calls `fs.remove_file()` directly and treats `ErrorKind::NotFound` as success. Do not flag the absence of a preceding `exists()` guard in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T16:19:04.825Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), `MemFs::sync_directory` was hardened in PR `#208` (commit 76f3b676) to distinguish between a path that exists as a file (returns an error) and a path that does not exist (returns `NotFound`). Do not flag the absence of this distinction in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), all maintenance-function unit tests were updated in PR `#212` (commit ecfa47c8) to use `MemFs` instead of `Path::new(".")` + `StdFs`, making them fully isolated from the real filesystem. Do not flag real-filesystem usage in those tests in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `tests/mem_fs_tree.rs` (structured-world/coordinode-lsm-tree), `memfs_reopen_recovers_flushed_data` was added in PR `#212` (commit c057154b). It exercises a full MemFs lifecycle: create a tree via `Config::with_shared_fs` backed by `Arc<dyn Fs>` (MemFs), insert keys, flush the active memtable, drop the tree, reopen from the same shared MemFs handle, and assert flushed keys/values are recovered. Do not flag the absence of a MemFs round-trip reopen test in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T19:18:08.308Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), empty-path validation is centralized in an `ensure_non_empty_path()` helper that is called at the top of `MemFs::open()`, `MemFs::create_dir_all()`, and `MemFs::rename()`. It returns `InvalidInput` for empty paths before any state mutation. This was added in PR `#208`. Do not flag the absence of inline empty-path checks in those methods in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T07:48:17.153Z
Learning: In `src/file.rs` (structured-world/coordinode-lsm-tree), `fsync_directory` no longer has a platform-conditional Windows no-op branch (removed in PR `#208`). It now unconditionally delegates to `Fs::sync_directory`. The Windows no-op behavior is encapsulated inside `StdFs::sync_directory`. Do not flag the absence of a platform-conditional Windows branch in `fsync_directory` in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T20:34:35.847Z
Learning: In `src/vlog/mod.rs` (structured-world/coordinode-lsm-tree), `recover_blob_files` defers all `insert_for_blob_file` calls into a `pending_cache_inserts` buffer and commits them only after the missing-file guard (`blob_files.len() < ids.len()`) passes. This was fixed in PR `#208` to prevent leaking FDs into the descriptor-table cache on partial/failed recovery. Do not flag the absence of this deferral as a gap in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 208
File: src/fs/mem_fs.rs:285-300
Timestamp: 2026-04-04T11:18:56.220Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), `MemFs::open()` and `MemFs::rename()` intentionally only check the immediate parent directory, not the full ancestor chain. lsm-tree always calls `create_dir_all` before creating files, so intermediate file-vs-dir conflicts can only arise from external filesystem corruption, which MemFs is not responsible for defending against. A full ancestor walk would add O(depth) cost to every `open()` call. Do not flag the absence of full ancestor-chain validation in MemFs::open() or MemFs::rename() in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T16:44:00.075Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), parent-directory existence validation is centralized in an `ensure_parent_dir()` helper shared by `MemFs::open()` and `MemFs::rename()`. This was extracted in PR `#208`, eliminating duplicated logic and allowing the `#[expect(clippy::too_many_lines)]` attribute to be removed from `open()`. Do not flag duplicated parent-dir checks between `open()` and `rename()` as a gap in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-03T17:42:38.208Z
Learning: In `src/tree/mod.rs` (structured-world/coordinode-lsm-tree), around line 1495, `Tree::open` intentionally checks `CURRENT` via `std::fs` instead of the pluggable `Fs` trait. This is a documented known limitation: MemFs-backed trees cannot be fully re-opened (`Tree::open` reopen is unsupported for MemFs). The bypass has an inline code comment and is tracked for resolution in issue `#209`. Do not flag this `std::fs` usage as a missing Fs-trait wiring in future reviews.

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Comment thread src/tree/mod.rs Outdated
Comment thread src/version/recovery.rs
- Treat NotFound as success in cleanup_orphaned_version, mirroring
  SuperVersions::maintenance — prevents spurious recovery failure
  when a file is deleted between read_dir and remove_file
- Use section.len() for recovery count bounds instead of
  Take::limit() which is unreliable through BufReader buffering
@polaz
Member Author

polaz commented Apr 5, 2026

@coderabbitai

Duplicate (tree_non_utf8.rs:39-41 "arbitrary write failures"): This was fixed in commit 6bf0fd0 (3 pushes ago). The current code at those lines is tree.insert / tree.flush_active_memtable — the is_err() pattern no longer exists. The write error handling is at lines 46-53 using match + is_filename_rejected(). Please drop this carried-forward duplicate.

Thread #17 (cleanup_orphaned_version NotFound): Fixed in 6043501. The remove_file call now treats NotFound as success, mirroring SuperVersions::maintenance.

Thread #18 (table_count bounds per-run): Kept section.len() as bound — Take::limit() through BufReader is unreliable due to buffering. The section-level bound is a loose upper bound but correct; per-run remaining bytes can't be cheaply computed through BufReader. Added code comment explaining this.
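
A rough sketch of the section-level bound described above, for illustration only. `checked_count` is a hypothetical helper, the one-byte-per-entry minimum is an assumption, and the real code reports the crate's own error type rather than `io::Error`.

```rust
use std::io;

/// Validates an on-disk element count before it is used as a Vec capacity.
/// `section_len` is the byte length of the section the count was read from.
/// Hypothetical helper, not the crate's actual function.
fn checked_count(count: u64, section_len: u64) -> io::Result<usize> {
    // Assumption: every encoded entry occupies at least one byte, so a
    // count larger than the section length cannot be genuine.
    if count > section_len || count > usize::MAX as u64 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "element count exceeds section length",
        ));
    }
    Ok(count as usize)
}
```

The bound is loose (a section of N bytes can never hold N multi-byte entries), but it caps the allocation at the size of data that actually exists on disk, which is enough to stop a corrupt count from requesting an enormous `Vec::with_capacity`.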

@coderabbitai

coderabbitai Bot commented Apr 5, 2026

@polaz All three acknowledged — thanks for tracking these down.

  • Duplicate (tree_non_utf8.rs:39-41): Confirmed dropped — the is_err() pattern is gone; lines 39-41 are now tree.insert / tree.flush_active_memtable, and the skip logic lives at lines 46-53 behind is_filename_rejected(). Will not re-flag this in future reviews.
  • Thread #17: NotFound-as-success in cleanup_orphaned_version is the right consistency fix — both the GC path (SuperVersions::maintenance) and the orphan-cleanup path now share the same idempotent delete idiom.
  • Thread #18: The BufReader buffering argument is sound: Take::limit() reflects bytes remaining in the underlying reader, not bytes remaining visible to the BufReader consumer, so per-run remaining-bytes accounting through BufReader can't be computed cheaply or reliably. section.len() as a loose-but-correct upper bound with an explanatory comment is the pragmatic call here.

✏️ Learnings added
Learnt from: polaz
URL: https://github.com/structured-world/coordinode-lsm-tree/pull/212

Timestamp: 2026-04-05T09:48:39.000Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), `cleanup_orphaned_version` was updated in PR `#212` (commit 6043501b) so that `remove_file` treats `ErrorKind::NotFound` as success, mirroring the idempotent-delete pattern used in `SuperVersions::maintenance`. Do not flag the absence of a `NotFound` guard in `cleanup_orphaned_version` in future reviews of PR `#212` or later code.

Learnt from: polaz
URL: https://github.com/structured-world/coordinode-lsm-tree/pull/212

Timestamp: 2026-04-05T09:48:39.001Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), the `table_count` bounds check in `recover()` uses `section.len()` as a loose upper bound rather than per-run remaining bytes. This is intentional: `Take::limit()` through a `BufReader` reflects bytes remaining in the underlying reader, not bytes visible to the buffered consumer, so per-run remaining-bytes tracking cannot be cheaply or reliably computed. A code comment was added in PR `#212` to explain this. Do not flag the use of `section.len()` as the capacity bound in future reviews of PR `#212` or later code.


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), `SuperVersions::maintenance` GC loop was fixed in PR `#212` (commit ecfa47c8) to eliminate the TOCTOU between `fs.exists()` and `fs.remove_file()`. The canonical fix calls `fs.remove_file()` directly and treats `ErrorKind::NotFound` as success. Do not flag the absence of a preceding `exists()` guard in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.437Z
Learning: In `src/config/mod.rs` (structured-world/coordinode-lsm-tree), the `with_fs` and `with_shared_fs` constructor doc comments were updated (beyond the `Config::fs` field doc) in PR `#212` to remove stale "reopen limitation" notes, reflecting that MemFs-backed trees can now be fully reopened. Do not flag those constructor docs as stale or missing reopen guidance in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T07:20:56.398Z
Learning: In `src/config.rs` (structured-world/coordinode-lsm-tree), the `Config::fs` field documentation was updated in PR `#212` (commit 099f812b) to remove the "reopen unsupported" caveat. It now states that both fresh creation and reopening are supported for any `Fs` backend. Do not flag `Config::fs` docs as stale or missing reopen guidance in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T20:34:35.847Z
Learning: In `src/vlog/mod.rs` (structured-world/coordinode-lsm-tree), `recover_blob_files` defers all `insert_for_blob_file` calls into a `pending_cache_inserts` buffer and commits them only after the missing-file guard (`blob_files.len() < ids.len()`) passes. This was fixed in PR `#208` to prevent leaking FDs into the descriptor-table cache on partial/failed recovery. Do not flag the absence of this deferral as a gap in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `tests/mem_fs_tree.rs` (structured-world/coordinode-lsm-tree), `memfs_reopen_recovers_flushed_data` was added in PR `#212` (commit c057154b). It exercises a full MemFs lifecycle: create a tree via `Config::with_shared_fs` backed by `Arc<dyn Fs>` (MemFs), insert keys, flush the active memtable, drop the tree, reopen from the same shared MemFs handle, and assert flushed keys/values are recovered. Do not flag the absence of a MemFs round-trip reopen test in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T07:20:56.399Z
Learning: In `src/fs/mod.rs` (structured-world/coordinode-lsm-tree), the `Fs::remove_file` trait method was documented in PR `#212` (commit 099f812b) to require that implementations return `ErrorKind::NotFound` for absent files. This contract is load-bearing: `SuperVersions::maintenance` version GC calls `remove_file` directly and relies on `NotFound` as an idempotent no-op. Do not flag the absence of an `exists()` guard before `remove_file` in GC code in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), all maintenance-function unit tests were updated in PR `#212` (commit ecfa47c8) to use `MemFs` instead of `Path::new(".")` + `StdFs`, making them fully isolated from the real filesystem. Do not flag real-filesystem usage in those tests in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T16:19:04.825Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), `MemFs::sync_directory` was hardened in PR `#208` (commit 76f3b676) to distinguish between a path that exists as a file (returns an error) and a path that does not exist (returns `NotFound`). Do not flag the absence of this distinction in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.437Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), all 5 GC unit tests were updated in PR `#212` to seed distinct `v{id}` files (IDs 1–4) into MemFs before calling `maintenance`, and to assert that the correct files are deleted or preserved after the call. Do not flag the absence of seeded v* files in those tests in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 208
File: src/fs/mem_fs.rs:285-300
Timestamp: 2026-04-04T11:18:56.220Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), `MemFs::open()` and `MemFs::rename()` intentionally only check the immediate parent directory, not the full ancestor chain. lsm-tree always calls `create_dir_all` before creating files, so intermediate file-vs-dir conflicts can only arise from external filesystem corruption, which MemFs is not responsible for defending against. A full ancestor walk would add O(depth) cost to every `open()` call. Do not flag the absence of full ancestor-chain validation in MemFs::open() or MemFs::rename() in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T09:05:23.807Z
Learning: In `tests/tree_non_utf8.rs` (structured-world/coordinode-lsm-tree), the `tree_reopen_rejects_non_utf8_filename_in_data_dir` test was updated in PR `#212` (commit 6bf0fd00) to handle macOS APFS behavior: writing a non-UTF-8 filename returns `EILSEQ` (os error 92) because APFS rejects invalid byte sequences at the syscall level. The test now panics on unexpected write failures and skips only on known filesystem-reject errors (`EILSEQ` and equivalent), rather than using a broad early-return on any write error. Do not flag the absence of this EILSEQ handling in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.437Z
Learning: In `tests/tree_non_utf8.rs` (structured-world/coordinode-lsm-tree), the test `tree_reopen_rejects_non_utf8_filename_in_data_dir` was added in PR `#212`. It is Unix-only and asserts that `Tree::open` (via `cleanup_orphaned_version`) returns `InvalidData` when a non-UTF-8 `v*` file is injected into the data directory. This pins the intentional behavior change from `to_string_lossy` to strict UTF-8 in `Fs::read_dir`. Do not flag the absence of this test in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T09:13:22.278Z
Learning: In `tests/tree_non_utf8.rs` (structured-world/coordinode-lsm-tree), the filesystem-reject check at lines 39–41 was updated in PR `#212` (commit 6bf0fd00) to use a `match` expression with an `is_filename_rejected()` helper instead of a bare `is_err()` guard. The helper encapsulates known filesystem-reject error codes (e.g., `EILSEQ`, os error 92 on macOS APFS) and makes the skip-vs-panic logic auditable in one place. Do not flag the absence of bare `is_err()` handling there in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T09:13:22.278Z
Learning: In `tests/tree_non_utf8.rs` (structured-world/coordinode-lsm-tree), the `tree_reopen_rejects_non_utf8_filename_in_data_dir` test was updated in PR `#212` (commit 77785031) to return `lsm_tree::Result<()>` and propagate errors with `?` instead of calling `unwrap()` on I/O paths. This satisfies the repo-wide no-`unwrap()`-on-I/O-paths rule. Do not flag `unwrap()` usage on I/O paths in that test in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.437Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), `recover()` was updated in PR `#212` to bounds-check `table_count` and `blob_file_count` at a maximum of 10,000,000 (10 M) before calling `Vec::with_capacity`. This prevents huge heap allocations from corrupt version files. Do not flag the absence of this guard in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 202
File: src/table/tests.rs:2113-2118
Timestamp: 2026-04-02T23:53:51.122Z
Learning: In `src/table/block_index/iter.rs` and `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), `trim_back_to_upper_bound` contains an unconditional restore (the `else { true }` branch at lines ~787-788 of decoder.rs) that always re-pushes a covering entry onto `hi_scanner.stack` when the stack would otherwise empty. This means `seek_upper_bound_cursor` always returns `true` for well-formed block data, regardless of `restart_interval`. Consequently, the `OwnedIndexBlockIter::from_block_with_bounds` returning `Ok(None)` (empty-child path) cannot be forced from test configuration (e.g., `use_index_block_restart_interval(4)`) on non-corrupt data — it is only reachable via `fill_stack`/`advance_upper_restart_interval` decode failure. Regression tests for the two-level index empty-child skip behavior (PR `#202`, issue `#194`) validate overall iteration correctness defensively, not the exact `Ok(None)` corruption path. Do not suggest forcing this path via `restart_interval` changes in reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 207
File: src/vlog/mod.rs:37-43
Timestamp: 2026-04-03T15:28:06.759Z
Learning: In `src/vlog/mod.rs` (structured-world/coordinode-lsm-tree), `recover_blob_files` intentionally returns `Ok((vec![], vec![]))` when `fs.read_dir(folder)` yields `ErrorKind::NotFound`. Standard (non-blob) trees never create a blobs folder, so `ids` is empty in that case and the downstream `blob_files.len() < ids.len()` guard (0 < 0 = false) passes correctly. Missing blob files for actual blob trees are handled by callers via orphan detection. This mirrors the pre-PR `folder.try_exists()` early-return and is not a correctness bug. Do not flag this early return as skipping the missing-data guard in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.437Z
Learning: In `src/tree/mod.rs` (structured-world/coordinode-lsm-tree), the `Tree::open` TOCTOU probe on the `CURRENT` version file was fixed in PR `#212` (closes issue `#213`) by replacing the `exists()` guard with a direct `get_current_version()` read attempt. If `CURRENT` is absent the read returns `NotFound` immediately, eliminating the TOCTOU window. Do not flag a missing `exists()` guard before the CURRENT probe in future reviews of PR `#212` or later code.
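
The probe-by-reading pattern from the last learning can be sketched roughly as follows. This is an illustration under assumptions: `Probe` and `probe_current` are hypothetical names, the sketch uses `std::fs` instead of the crate's `Fs` trait, and it returns the raw bytes rather than a decoded version ID.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Outcome of probing a tree directory. Hypothetical type for this sketch.
#[derive(Debug, PartialEq)]
enum Probe {
    /// No CURRENT file: treat the directory as a fresh tree.
    Fresh,
    /// CURRENT was read; the caller decodes and validates the bytes.
    Existing(Vec<u8>),
}

fn probe_current(dir: &Path) -> io::Result<Probe> {
    // A single read attempt replaces the exists() + read pair, so there is
    // no window in which the file can vanish between the two calls.
    match fs::read(dir.join("CURRENT")) {
        Ok(bytes) => Ok(Probe::Existing(bytes)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(Probe::Fresh),
        Err(e) => Err(e),
    }
}
```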

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/version/recovery.rs`:
- Around line 269-270: The test currently only checks result.is_err() after
calling recover(folder, &fs); tighten it to assert the specific
corruption/validation error variant instead of any error: replace the generic
assert!(result.is_err(), ...) with an assertion that matches the exact
corruption error returned by recover (e.g. using matches!(result,
Err(<expected_corruption_variant>)) or pattern-matching the Err to compare to
the corruption enum/variant). Do the same fix for the other occurrence around
lines 284-288 so both regression tests verify the expected corrupt table_count
error from recover().

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: cf31d57a-7a5d-4aff-9586-0711668276d8

📥 Commits

Reviewing files that changed from the base of the PR and between dd5f4a0 and 6043501.

📒 Files selected for processing (2)
  • src/tree/mod.rs
  • src/version/recovery.rs

Comment thread src/version/recovery.rs
Comment thread src/version/recovery.rs Outdated
…rsion_id

- Read and validate checksum_type tag (must be 0 = xxh3) in
  get_current_version() — detects torn/corrupt CURRENT files
  before attempting recovery with a potentially wrong version ID
- Remove stale TODO about checksum validation
- Tighten corruption test assertions to match Unrecoverable specifically
@polaz
Member Author

polaz commented Apr 5, 2026

@coderabbitai

Thread #19 (validate CURRENT checksum): Fixed in 49a2306. get_current_version() now reads all 25 bytes (version_id + checksum + checksum_type) and validates checksum_type == 0 (xxh3). A corrupt/torn CURRENT file with an invalid checksum_type will be rejected before recovery attempts to load a wrong version.
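
As an illustration of that check, here is a minimal sketch of reading a 25-byte pointer file and rejecting an unknown checksum tag. The field widths (8-byte version_id, 16-byte checksum, 1-byte checksum_type) are inferred from the 25-byte total, and the little-endian encoding, the `read_current` name, and the use of `io::Error` are assumptions of this sketch. It does not verify the checksum itself.

```rust
use std::io::{self, Read};

/// Tag value for xxh3, as stated in the thread above.
const CHECKSUM_TYPE_XXH3: u8 = 0;

/// Reads the assumed 25-byte layout: u64 version_id, 16-byte checksum,
/// u8 checksum_type. Little-endian encoding is an assumption.
fn read_current<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut buf = [0u8; 25];
    // read_exact fails with UnexpectedEof on a truncated (torn) file.
    reader.read_exact(&mut buf)?;

    // Reject the file before trusting any other field.
    if buf[24] != CHECKSUM_TYPE_XXH3 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "unknown checksum type in CURRENT",
        ));
    }

    let mut id = [0u8; 8];
    id.copy_from_slice(&buf[..8]);
    Ok(u64::from_le_bytes(id))
}
```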

Thread #20 (tighten corruption test assertions): Fixed in 49a2306 — both tests now assert matches!(err, crate::Error::Unrecoverable) instead of generic is_err().
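
To show why the tighter assertion matters, here is a small sketch with a stand-in error type. `Error` and `recover_count` are hypothetical and only mimic the shape of the real recovery path.

```rust
/// Hypothetical stand-in for the crate's error type.
#[derive(Debug)]
#[allow(dead_code)]
enum Error {
    Unrecoverable,
    Io(std::io::ErrorKind),
}

/// Hypothetical recovery step that rejects an implausible count.
fn recover_count(count: u64, section_len: u64) -> Result<u64, Error> {
    if count > section_len {
        return Err(Error::Unrecoverable);
    }
    Ok(count)
}
```

A test written as `assert!(matches!(recover_count(u64::MAX, 64), Err(Error::Unrecoverable)))` fails if the function starts returning `Error::Io(..)` instead, whereas a bare `is_err()` check would keep passing.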

@coderabbitai

coderabbitai Bot commented Apr 5, 2026

@polaz Both fixes are solid — thanks for the follow-through on these.

  • Thread #19: Validating checksum_type == 0 (xxh3) before trusting version_id is exactly the right defense against torn writes on the CURRENT file. A partially-written 8-byte version_id that passes as a valid u64 but points to the wrong version is a subtle corruption class, and rejecting at the checksum_type field stops it before any recovery I/O begins.
  • Thread #20: Asserting matches!(err, crate::Error::Unrecoverable) instead of bare is_err() is the right level of precision — it validates that the bounds-check path returns the intended error variant, not just that it errors at all.

(ᓀ ᵕ ᓂ) — all threads closed on this one.


✏️ Learnings added
Learnt from: polaz
URL: https://github.com/structured-world/coordinode-lsm-tree/pull/212

Timestamp: 2026-04-05T10:11:16.990Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), `get_current_version()` was updated in PR `#212` (commit 49a23065) to read all 25 bytes of the CURRENT file (version_id + checksum + checksum_type) and validate checksum_type == 0 (xxh3). A corrupt or torn CURRENT file with an invalid checksum_type is rejected immediately, before recovery attempts to load a potentially wrong version. Do not flag the absence of this CURRENT checksum validation in future reviews of PR `#212` or later code.

Learnt from: polaz
URL: https://github.com/structured-world/coordinode-lsm-tree/pull/212

Timestamp: 2026-04-05T10:11:16.990Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), the `table_count` and `blob_file_count` corruption unit tests were updated in PR `#212` (commit 49a23065) to assert `matches!(err, crate::Error::Unrecoverable)` instead of generic `is_err()`. This pins the exact error variant returned by the bounds-check path. Do not flag the absence of this precise assertion in future reviews of PR `#212` or later code.


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), `SuperVersions::maintenance` GC loop was fixed in PR `#212` (commit ecfa47c8) to eliminate the TOCTOU between `fs.exists()` and `fs.remove_file()`. The canonical fix calls `fs.remove_file()` directly and treats `ErrorKind::NotFound` as success. Do not flag the absence of a preceding `exists()` guard in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.437Z
Learning: In `src/config/mod.rs` (structured-world/coordinode-lsm-tree), the `with_fs` and `with_shared_fs` constructor doc comments were updated (beyond the `Config::fs` field doc) in PR `#212` to remove stale "reopen limitation" notes, reflecting that MemFs-backed trees can now be fully reopened. Do not flag those constructor docs as stale or missing reopen guidance in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T07:20:56.398Z
Learning: In `src/config.rs` (structured-world/coordinode-lsm-tree), the `Config::fs` field documentation was updated in PR `#212` (commit 099f812b) to remove the "reopen unsupported" caveat. It now states that both fresh creation and reopening are supported for any `Fs` backend. Do not flag `Config::fs` docs as stale or missing reopen guidance in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T09:48:39.000Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), `cleanup_orphaned_version` was updated in PR `#212` (commit 6043501b) so that `remove_file` treats `ErrorKind::NotFound` as success, mirroring the idempotent-delete pattern used in `SuperVersions::maintenance`. Do not flag the absence of a `NotFound` guard in `cleanup_orphaned_version` in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `tests/mem_fs_tree.rs` (structured-world/coordinode-lsm-tree), `memfs_reopen_recovers_flushed_data` was added in PR `#212` (commit c057154b). It exercises a full MemFs lifecycle: create a tree via `Config::with_shared_fs` backed by `Arc<dyn Fs>` (MemFs), insert keys, flush the active memtable, drop the tree, reopen from the same shared MemFs handle, and assert flushed keys/values are recovered. Do not flag the absence of a MemFs round-trip reopen test in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T20:34:35.847Z
Learning: In `src/vlog/mod.rs` (structured-world/coordinode-lsm-tree), `recover_blob_files` defers all `insert_for_blob_file` calls into a `pending_cache_inserts` buffer and commits them only after the missing-file guard (`blob_files.len() < ids.len()`) passes. This was fixed in PR `#208` to prevent leaking FDs into the descriptor-table cache on partial/failed recovery. Do not flag the absence of this deferral as a gap in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T07:20:56.399Z
Learning: In `src/fs/mod.rs` (structured-world/coordinode-lsm-tree), the `Fs::remove_file` trait method was documented in PR `#212` (commit 099f812b) to require that implementations return `ErrorKind::NotFound` for absent files. This contract is load-bearing: `SuperVersions::maintenance` version GC calls `remove_file` directly and relies on `NotFound` as an idempotent no-op. Do not flag the absence of an `exists()` guard before `remove_file` in GC code in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T16:19:04.825Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), `MemFs::sync_directory` was hardened in PR `#208` (commit 76f3b676) to distinguish between a path that exists as a file (returns an error) and a path that does not exist (returns `NotFound`). Do not flag the absence of this distinction in future reviews of PR `#208` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T01:09:33.224Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), all maintenance-function unit tests were updated in PR `#212` (commit ecfa47c8) to use `MemFs` instead of `Path::new(".")` + `StdFs`, making them fully isolated from the real filesystem. Do not flag real-filesystem usage in those tests in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.437Z
Learning: In `src/version/super_version.rs` (structured-world/coordinode-lsm-tree), all 5 GC unit tests were updated in PR `#212` to seed distinct `v{id}` files (IDs 1–4) into MemFs before calling `maintenance`, and to assert that the correct files are deleted or preserved after the call. Do not flag the absence of seeded v* files in those tests in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T09:05:23.807Z
Learning: In `tests/tree_non_utf8.rs` (structured-world/coordinode-lsm-tree), the `tree_reopen_rejects_non_utf8_filename_in_data_dir` test was updated in PR `#212` (commit 6bf0fd00) to handle macOS APFS behavior: writing a non-UTF-8 filename returns `EILSEQ` (os error 92) because APFS rejects invalid byte sequences at the syscall level. The test now panics on unexpected write failures and skips only on known filesystem-reject errors (`EILSEQ` and equivalent), rather than using a broad early-return on any write error. Do not flag the absence of this EILSEQ handling in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.437Z
Learning: In `tests/tree_non_utf8.rs` (structured-world/coordinode-lsm-tree), the test `tree_reopen_rejects_non_utf8_filename_in_data_dir` was added in PR `#212`. It is Unix-only and asserts that `Tree::open` (via `cleanup_orphaned_version`) returns `InvalidData` when a non-UTF-8 `v*` file is injected into the data directory. This pins the intentional behavior change from `to_string_lossy` to strict UTF-8 in `Fs::read_dir`. Do not flag the absence of this test in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T09:13:22.278Z
Learning: In `tests/tree_non_utf8.rs` (structured-world/coordinode-lsm-tree), the filesystem-reject check at lines 39–41 was updated in PR `#212` (commit 6bf0fd00) to use a `match` expression with an `is_filename_rejected()` helper instead of a bare `is_err()` guard. The helper encapsulates known filesystem-reject error codes (e.g., `EILSEQ`, os error 92 on macOS APFS) and makes the skip-vs-panic logic auditable in one place. Do not flag the absence of bare `is_err()` handling there in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T09:13:22.278Z
Learning: In `tests/tree_non_utf8.rs` (structured-world/coordinode-lsm-tree), the `tree_reopen_rejects_non_utf8_filename_in_data_dir` test was updated in PR `#212` (commit 77785031) to return `lsm_tree::Result<()>` and propagate errors with `?` instead of calling `unwrap()` on I/O paths. This satisfies the repo-wide no-`unwrap()`-on-I/O-paths rule. Do not flag `unwrap()` usage on I/O paths in that test in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.437Z
Learning: In `src/tree/mod.rs` (structured-world/coordinode-lsm-tree), the `Tree::open` TOCTOU probe on the `CURRENT` version file was fixed in PR `#212` (closes issue `#213`) by replacing the `exists()` guard with a direct `get_current_version()` read attempt. If `CURRENT` is absent the read returns `NotFound` immediately, eliminating the TOCTOU window. Do not flag a missing `exists()` guard before the CURRENT probe in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 202
File: src/table/tests.rs:2113-2118
Timestamp: 2026-04-02T23:53:51.122Z
Learning: In `src/table/block_index/iter.rs` and `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), `trim_back_to_upper_bound` contains an unconditional restore (the `else { true }` branch at lines ~787-788 of decoder.rs) that always re-pushes a covering entry onto `hi_scanner.stack` when the stack would otherwise empty. This means `seek_upper_bound_cursor` always returns `true` for well-formed block data, regardless of `restart_interval`. Consequently, the `OwnedIndexBlockIter::from_block_with_bounds` returning `Ok(None)` (empty-child path) cannot be forced from test configuration (e.g., `use_index_block_restart_interval(4)`) on non-corrupt data — it is only reachable via `fill_stack`/`advance_upper_restart_interval` decode failure. Regression tests for the two-level index empty-child skip behavior (PR `#202`, issue `#194`) validate overall iteration correctness defensively, not the exact `Ok(None)` corruption path. Do not suggest forcing this path via `restart_interval` changes in reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T08:16:42.437Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), `recover()` was updated in PR `#212` to bounds-check `table_count` and `blob_file_count` at a maximum of 10,000,000 (10 M) before calling `Vec::with_capacity`. This prevents huge heap allocations from corrupt version files. Do not flag the absence of this guard in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-30T22:48:06.079Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), the "binary search trusts semantically-corrupted binary index" concern (lines 245-299) is explicitly out of scope for PR `#182`. The scenario requires a corrupted binary index that passes the block checksum (encoder bug or checksum collision). Adding O(n) monotonicity validation per seek call would regress the hot path for a case already caught by checksums. This hardening is tracked in issue `#184` (fallible decoder constructor). Do not flag the absence of monotonicity validation on the binary-index seek path in reviews of PR `#182` or earlier code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-31T14:57:51.527Z
Learning: In `src/table/meta.rs` (structured-world/coordinode-lsm-tree), meta fields (e.g., `table_version`, `filter_hash_type`, `checksum_type`, compression keys) can still panic on corrupt input via `read_u8!`/`read_u64!` macro paths. This is out of scope for PR `#191` (which focuses on trailer field validation). The broader meta-field hardening is tracked in issue `#193`. Do not re-flag this as an unaddressed gap in reviews of PR `#191` or earlier code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-01T21:25:58.275Z
Learning: In `src/table/scanner.rs` (structured-world/coordinode-lsm-tree), after commit d6dde989 in PR `#191`, the `fetch_next_block` error path sets `self.read_count = self.block_count` before returning the error, exhausting the scanner on mid-scan block-fetch failures — consistent with the `try_new` error path (commit d66197d). Both error paths now poison the scanner so callers cannot silently skip corrupt blocks. Do not flag the absence of this scanner-poison behavior on `fetch_next_block` errors in reviews of PR `#191` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-04T16:44:00.075Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), parent-directory existence validation is centralized in an `ensure_parent_dir()` helper shared by `MemFs::open()` and `MemFs::rename()`. This was extracted in PR `#208`, eliminating duplicated logic and allowing the `#[expect(clippy::too_many_lines)]` attribute to be removed from `open()`. Do not flag duplicated parent-dir checks between `open()` and `rename()` as a gap in future reviews.
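
Several of the learnings above rely on the same idempotent-delete pattern: call `remove_file` directly and treat `ErrorKind::NotFound` as success instead of probing with `exists()` first. A minimal sketch follows, assuming plain `std::fs` in place of the crate's `Fs` trait; the helper name `remove_file_idempotent` is hypothetical.

```rust
use std::io::{ErrorKind, Result};
use std::path::Path;

/// Hypothetical helper: delete `path`, treating an already-missing file as success.
fn remove_file_idempotent(path: &Path) -> Result<()> {
    match std::fs::remove_file(path) {
        Ok(()) => Ok(()),
        // Someone else (or an earlier GC pass) already removed it: nothing to do.
        Err(e) if e.kind() == ErrorKind::NotFound => Ok(()),
        // Anything else (permissions, I/O failure) is still reported.
        Err(e) => Err(e),
    }
}
```

Because there is no separate existence check, there is no window between "check" and "use" in which another actor can delete the file and turn a harmless race into an error.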

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/version/recovery.rs`:
- Around line 219-247: In write_corrupt_table_count add a short inline comment
before the writes for "tree_type" and "blob_gc_stats" explaining these sections
are written only to satisfy recover()'s expectation that all sections exist
(even though the test targets "tables"), e.g., note that recover() requires
tree_type, tables, blob_files and blob_gc_stats to be present so we construct
minimal valid sections for the others; reference the function name
write_corrupt_table_count and the section names "tree_type" and "blob_gc_stats"
in the comment.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c3796ed9-43a5-4cf3-94b4-00bb53e37821

📥 Commits

Reviewing files that changed from the base of the PR and between 6043501 and 49a2306.

📒 Files selected for processing (1)
  • src/version/recovery.rs

Comment thread src/version/recovery.rs

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated no new comments.

@polaz
Member Author

polaz commented Apr 5, 2026

@coderabbitai resume

@coderabbitai

coderabbitai Bot commented Apr 5, 2026

✅ Actions performed

Reviews resumed.

@polaz
Member Author

polaz commented Apr 5, 2026

@coderabbitai Thread #21: Fixed in 6101615 — added doc comments to both test helpers explaining why all four sfa sections are written.

@coderabbitai

coderabbitai Bot commented Apr 5, 2026

@polaz Acknowledged — adding doc comments that explain why all four sfa sections are always written is the right approach. It makes the test helper intent self-documenting and removes any ambiguity about whether the full section set is incidental or required by the format.


✏️ Learnings added
Learnt from: polaz
URL: https://github.com/structured-world/coordinode-lsm-tree/pull/212

Timestamp: 2026-04-05T10:23:54.459Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), both corruption test helpers (`write_corrupt_table_count` and its companion) were updated in PR `#212` (commit 6101615) with doc comments explaining why all four sfa sections (tree_type, tables, blob_files, blob_gc_stats) are written: `recover()` expects every section to be present, so the helpers build minimal valid sections around the one under test. Do not flag the absence of this explanation in future reviews of PR `#212` or later code.


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 208
File: src/fs/mem_fs.rs:285-300
Timestamp: 2026-04-04T11:18:56.220Z
Learning: In `src/fs/mem_fs.rs` (structured-world/coordinode-lsm-tree), `MemFs::open()` and `MemFs::rename()` intentionally only check the immediate parent directory, not the full ancestor chain. lsm-tree always calls `create_dir_all` before creating files, so intermediate file-vs-dir conflicts can only arise from external filesystem corruption, which MemFs is not responsible for defending against. A full ancestor walk would add O(depth) cost to every `open()` call. Do not flag the absence of full ancestor-chain validation in MemFs::open() or MemFs::rename() in future reviews.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T10:11:16.990Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), the `table_count` and `blob_file_count` corruption unit tests were updated in PR `#212` (commit 49a23065) to assert `matches!(err, crate::Error::Unrecoverable)` instead of generic `is_err()`. This pins the exact error variant returned by the bounds-check path. Do not flag the absence of this precise assertion in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-27T07:10:44.958Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), the restart-key span validation is intentionally split: strict bounds hardening (buffer/offset checks) was implemented in PR `#182`, while full semantic restart-key span validation (which requires passing the restart-key span through decoder state) is tracked separately in issue `#183`. Do not flag the absence of semantic-span validation in decoder.rs as an unaddressed gap in reviews of PR `#182` or earlier code; it is a known, tracked follow-up.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-04-05T10:11:16.990Z
Learning: In `src/version/recovery.rs` (structured-world/coordinode-lsm-tree), `get_current_version()` was updated in PR `#212` (commit 49a23065) to read all 25 bytes of the CURRENT file (version_id + checksum + checksum_type) and validate checksum_type == 0 (xxh3). A corrupt or torn CURRENT file with an invalid checksum_type is rejected immediately, before recovery attempts to load a potentially wrong version. Do not flag the absence of this CURRENT checksum validation in future reviews of PR `#212` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 182
File: src/table/block/decoder.rs:366-372
Timestamp: 2026-03-30T13:21:41.943Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), `poison_back_cursor()` now delegates to `clamp_upper_to_lo()` instead of clearing `hi_scanner.base_key_offset` to `None`. This preserves a hard forward upper bound (`base_key_offset = Some(0)`) so that a caller switching from `next_back()` to `next()` after corruption cannot resume past the poisoned interval. Regression test `poison_back_cursor_also_stops_forward_next` covers this path. Do not flag `poison_back_cursor` as fail-open in reviews of PR `#182` or later code.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 182
File: src/table/block/decoder.rs:104-110
Timestamp: 2026-03-27T12:08:02.765Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), the `assert!(restart_interval > 0, ...)` in `Decoder::new` is an intentional placeholder: the proper fix — a fallible decoder constructor that propagates a corruption error instead of panicking — is a wider refactor tracked in issue `#184`. A `TODO(`#184`)` comment was added at the assert site in PR `#182`. Do not re-flag this assert as a missing fallible validation in reviews of PR `#182` or earlier code; it is a known, tracked follow-up.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-30T04:38:24.388Z
Learning: In `src/table/index_block/block_handle.rs` (structured-world/coordinode-lsm-tree), `parse_full` bounds hardening, `#[must_use]` annotations on parser methods, and remaining `value_type` unwrap paths in `parse_full` are pre-existing code outside the diff of PR `#182`. The broader hardening (including a fallible decoder constructor) is tracked in issue `#184`. Do not flag these as unaddressed gaps in reviews of PR `#182`; they are known, tracked follow-ups.

Learnt from: polaz
Repo: structured-world/coordinode-lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-31T14:37:26.098Z
Learning: In `src/table/block/decoder.rs` (structured-world/coordinode-lsm-tree), the `unwrap!` calls on `read_u8`/`read_u32` inside `Decoder::try_new()` (lines ~139–151) will panic on a truncated trailer slice. This is guarded at a higher level: `Block::from_reader` and checksum validation reject undersized blocks before they reach the decoder, and `Trailer::trailer_offset()` would underflow on undersized blocks — making this a pre-existing pipeline invariant, not introduced by PR `#191`. Making `Trailer::as_slice()` and its downstream reads fallible to handle truncated/undersized blocks gracefully is tracked in issue `#192`. Do not re-flag the `unwrap!` reads in `Decoder::try_new()` as an unaddressed gap in reviews of PR `#191` or earlier code.
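
As an illustration of what fallible trailer reads could look like, here is a minimal sketch of bounds-checked readers over a byte slice. The helper names, the `Truncated` error type, and the little-endian encoding are all assumptions; the thread does not show the crate's `unwrap!` macro or its trailer layout.

```rust
/// Hypothetical error describing a read past the end of the buffer.
#[derive(Debug, PartialEq, Eq)]
pub struct Truncated {
    /// Number of bytes the read would have needed.
    pub needed: usize,
    /// Number of bytes actually available.
    pub available: usize,
}

/// Reads one byte at `offset`, or reports truncation instead of panicking.
pub fn read_u8(buf: &[u8], offset: usize) -> Result<u8, Truncated> {
    buf.get(offset).copied().ok_or(Truncated {
        needed: offset.saturating_add(1),
        available: buf.len(),
    })
}

/// Reads a little-endian `u32` at `offset` (endianness is an assumption).
pub fn read_u32_le(buf: &[u8], offset: usize) -> Result<u32, Truncated> {
    // `checked_add` guards against overflow of the end index itself.
    let end = offset.checked_add(4).ok_or(Truncated {
        needed: usize::MAX,
        available: buf.len(),
    })?;
    let bytes = buf.get(offset..end).ok_or(Truncated {
        needed: end,
        available: buf.len(),
    })?;
    let mut arr = [0u8; 4];
    arr.copy_from_slice(bytes);
    Ok(u32::from_le_bytes(arr))
}
```

Because `slice::get` returns `None` for out-of-range indices, an undersized trailer surfaces as a `Truncated` error rather than a panic.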

@polaz
Member Author

polaz commented Apr 5, 2026

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 5, 2026

✅ Actions performed

Full review triggered.

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

Comment thread src/version/recovery.rs Outdated
Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated no new comments.

@polaz polaz merged commit 5b6eed9 into main Apr 5, 2026
20 of 21 checks passed
@polaz polaz deleted the feat/#209-refactorfs-migrate-treeopen-recovery-path-to-fs-tr branch April 5, 2026 13:17
@sw-release-bot sw-release-bot Bot mentioned this pull request Apr 5, 2026
polaz pushed a commit that referenced this pull request Apr 5, 2026
## 🤖 New release

* `coordinode-lsm-tree`: 4.2.0 -> 4.3.0

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [4.3.0](v4.2.0...v4.3.0) - 2026-04-05

### Added

- *(fs)* MemFs — in-memory Fs implementation for testing and in-memory trees ([#211](#211))

### Fixed

- *(table)* validate block type on cache-hit path ([#203](#203))
- *(table)* two-level index scan stops prematurely on empty child partitions ([#202](#202))

### Performance

- *(table)* add infallible OwnedIndexBlockIter constructor for pre-validated blocks ([#206](#206))

### Refactored

- *(fs)* migrate Tree::open recovery path to Fs trait ([#212](#212))
- *(table)* make index block bound-cursor helpers fallible ([#205](#205))
- *(table)* make all meta/trailer reads fallible for truncated blocks ([#204](#204))
- *(table)* make block decoder trailer validation fallible ([#199](#199))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: sw-release-bot[bot] <255865126+sw-release-bot[bot]@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

refactor(tree): eliminate TOCTOU race in Tree::open CURRENT file probe
refactor(fs): migrate Tree::open recovery path to Fs trait

2 participants