feat(fs): MemFs — in-memory Fs implementation for testing and in-memory trees #207
polaz wants to merge 16 commits into
Conversation
- Implement `MemFs` + `MemFile` backed by `HashMap<PathBuf, Vec<u8>>`
- Add `Config::with_fs()` builder for plugging alternative `Fs` backends
- De-generify `Config<F: Fs>` to `Config` with `Arc<dyn Fs>` (zero API breakage)
- Rewrite `rewrite_atomic` to use the `Fs` trait instead of the `tempfile` crate (#188)
- Pipe `Arc<dyn Fs>` through `Table::recover`, `FileAccessor`, `load_block`, and vlog paths so file opens go through the pluggable `Fs` on cache miss
- Add 20 unit tests for `MemFs` + 5 integration tests (open, write, flush, read, delete, range, multiple flushes, shared `MemFs` across trees)

Known limitations: compaction and tree reopen are not yet fully supported with `MemFs` (remaining `std::fs` bypass points).

Closes #187
Closes #188
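The storage model in the description can be sketched as a self-contained toy: a path-to-byte-buffer map behind a lock. The names here (`MemStore` and its `write`/`read`/`remove` methods) are illustrative stand-ins, not the PR's actual `MemFs`/`Fs` trait surface.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

// Toy version of the PR's storage model: HashMap<PathBuf, Vec<u8>> behind a
// Mutex, cloneable so multiple "trees" can share the same store.
#[derive(Default, Clone)]
struct MemStore {
    files: Arc<Mutex<HashMap<PathBuf, Vec<u8>>>>,
}

impl MemStore {
    fn write(&self, path: &Path, data: &[u8]) {
        self.files
            .lock()
            .unwrap()
            .insert(path.to_path_buf(), data.to_vec());
    }

    fn read(&self, path: &Path) -> io::Result<Vec<u8>> {
        self.files
            .lock()
            .unwrap()
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }

    fn remove(&self, path: &Path) -> io::Result<()> {
        self.files
            .lock()
            .unwrap()
            .remove(path)
            .map(|_| ())
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }
}

fn main() {
    let fs = MemStore::default();
    fs.write(Path::new("/a/b.sst"), b"hello");
    assert_eq!(fs.read(Path::new("/a/b.sst")).unwrap(), b"hello");
    assert!(fs.remove(Path::new("/missing")).is_err());
}
```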
Caution: Review failed — pull request was closed or merged during review.
📝 Walkthrough: introduces an in-memory filesystem backend (`MemFs`).
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 5 passed
Pull request overview
Adds a fully in-memory Fs backend (MemFs) and threads a pluggable Arc<dyn Fs> through table/vlog access paths so tests and (eventually) in-memory trees can run without touching the host filesystem, including refactoring rewrite_atomic to be Fs-based.
Changes:
- Introduce `MemFs`/`MemFile` and export it from `crate::fs`.
- Switch `Config` to store `Arc<dyn Fs>` and add `Config::with_fs()` to set an alternative backend.
- Route descriptor-cache miss opens (tables + vlog blob files) through `Fs` and update a variety of call sites/tests accordingly.
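The de-generification step can be sketched standalone: a trait-object field replaces the `Config<F: Fs>` type parameter, so the public type stays plain `Config`. The minimal `Fs` trait and backend structs below are assumptions for illustration; the crate's real trait has many more methods.

```rust
use std::sync::Arc;

// Hypothetical minimal Fs trait; the crate's real trait is much richer.
trait Fs: Send + Sync {
    fn name(&self) -> &'static str;
}

struct StdFs;
impl Fs for StdFs {
    fn name(&self) -> &'static str { "std" }
}

struct MemFs;
impl Fs for MemFs {
    fn name(&self) -> &'static str { "mem" }
}

// De-generified config: a trait object instead of Config<F: Fs>, so the
// public type name does not change (zero API breakage).
struct Config {
    fs: Arc<dyn Fs>,
}

impl Config {
    fn new() -> Self {
        Self { fs: Arc::new(StdFs) }
    }

    // Builder mirroring the PR's Config::with_fs().
    fn with_fs(mut self, fs: Arc<dyn Fs>) -> Self {
        self.fs = fs;
        self
    }
}

fn main() {
    let cfg = Config::new().with_fs(Arc::new(MemFs));
    assert_eq!(cfg.fs.name(), "mem");
}
```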
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| tests/mem_fs_tree.rs | New integration tests covering opening/flush/range ops using MemFs. |
| src/vlog/mod.rs | Pass Fs into vlog recovery and use it for blob-file opens. |
| src/vlog/blob_file/writer.rs | Import reordering to match broader refactor. |
| src/vlog/blob_file/scanner.rs | Import reordering in scanner + tests. |
| src/vlog/blob_file/reader.rs | Import reordering in reader + tests. |
| src/vlog/blob_file/multi_writer.rs | Use Fs for read-back open and propagate Arc<dyn Fs> into FileAccessor. |
| src/vlog/blob_file/mod.rs | Import reordering. |
| src/vlog/blob_file/meta.rs | Import reordering. |
| src/vlog/blob_file/merge.rs | Import reordering. |
| src/vlog/accessor.rs | Route blob-file FD cache miss opens through FileAccessor (and thus Fs). |
| src/tree/mod.rs | Plumb config.fs into table recovery and vlog recovery. |
| src/tree/ingest.rs | Pass fs through ingestion-created table recovery paths. |
| src/table/util.rs | Use FileAccessor::get_or_open_table() for FD cache miss opens. |
| src/table/tests.rs | Update tests for new FileAccessor::DescriptorTable { .. } shape and new fs param. |
| src/table/multi_writer.rs | Update tests/call sites for new table recovery signature including fs. |
| src/table/mod.rs | Table recovery now opens via Fs; store fs on Table::Inner; FileAccessor now carries fs for cache misses. |
| src/table/inner.rs | Cleanup (drop) path uses Fs::remove_file rather than std::fs. |
| src/fs/mod.rs | Export new MemFs backend. |
| src/fs/mem_fs.rs | New MemFs implementation + unit tests. |
| src/file.rs | Refactor rewrite_atomic to create temp files via Fs instead of tempfile. |
| src/file_accessor.rs | FileAccessor::DescriptorTable now includes { table, fs } and adds get-or-open helpers. |
| src/config/mod.rs | Config now stores Arc<dyn Fs> and adds .with_fs() builder. |
| src/compaction/flavour.rs | Pass fs through compaction table recovery path. |
| src/blob_tree/mod.rs | Plumb fs into blob-tree table recovery. |
| src/blob_tree/ingest.rs | Pass fs through blob-tree ingestion recovery path. |
| src/blob_tree/gc.rs | Import ordering only. |
| benches/index_block.rs | Update benchmark table recovery call signature to include fs. |
Comments suppressed due to low confidence (2)
src/vlog/mod.rs:88
`recover_blob_files` still calls `sfa::Reader::new(&blob_file_path)`, which (like other `Reader::new(path)` uses) will open via `std::fs` and bypass the pluggable `Fs`. To keep recovery I/O fully routed through the configured backend, prefer `sfa::Reader::from_reader(&mut file)` using the `Fs`-opened handle (similar to `Table::recover`).
let file = fs.open(&blob_file_path, &crate::fs::FsOpenOptions::new().read(true))?;
let meta = {
let reader = sfa::Reader::new(&blob_file_path)?;
let toc = reader.toc();
src/table/util.rs:90
- This refactor removed the `metrics.table_file_opened_cached/uncached` increments that were previously emitted on descriptor-table cache hit/miss in `load_block`. As a result, `Metrics::table_file_cache_hit_rate()` can become misleading (often returning 1.0 because both counters stay at 0 outside recovery). Consider reintroducing these increments, e.g., by having `FileAccessor::get_or_open_table` report hit vs miss (or accept an optional metrics handle).
return Ok(block);
}
let fd = file_accessor.get_or_open_table(&table_id, path)?;
let block = Block::from_file(
fd.as_ref(),
*handle,
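The reviewer's suggestion (make the accessor report hit vs miss so the counters can be restored) can be sketched generically. `FdCache`, `get_or_open`, and `hit_rate` below are hypothetical stand-ins, not the crate's `FileAccessor` or `Metrics` API.

```rust
use std::collections::HashMap;

// Hypothetical descriptor cache: return the handle plus a hit flag so the
// caller can bump the cached/uncached counters on every open.
struct FdCache {
    fds: HashMap<u64, String>, // table id -> fake handle
    hits: u64,
    misses: u64,
}

impl FdCache {
    fn new() -> Self {
        Self { fds: HashMap::new(), hits: 0, misses: 0 }
    }

    fn get_or_open(&mut self, id: u64) -> (&String, bool) {
        let hit = self.fds.contains_key(&id);
        if hit { self.hits += 1 } else { self.misses += 1 }
        let fd = self.fds.entry(id).or_insert_with(|| format!("fd-{id}"));
        (fd, hit)
    }

    fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        // Without the counters, this would always report 1.0 -- the
        // misleading behavior the review comment describes.
        if total == 0 { 1.0 } else { self.hits as f64 / total as f64 }
    }
}

fn main() {
    let mut cache = FdCache::new();
    assert!(!cache.get_or_open(7).1); // first open is a miss
    assert!(cache.get_or_open(7).1); // second is served from cache
    assert_eq!(cache.hit_rate(), 0.5);
}
```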
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/blob_tree/ingest.rs (1)
Lines 223-231: ⚠️ Potential issue | 🟠 Major: Recovery FS should match the table writer FS backend.
This call uses `index.config.fs`, but the index tables produced by `TableIngestion` are written via level-routed filesystem selection. On non-default routes, this can break recovery/open of newly ingested tables.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/blob_tree/ingest.rs` around lines 223 - 231, The Table::recover call is using index.config.fs but TableIngestion writes index tables via level-routed filesystem selection; change the fs argument to the same level-routed FS used when writing the table (i.e. obtain the route-specific FS for this table id the same way TableIngestion does) so recovery/open uses the identical backend; specifically, replace the index.config.fs parameter passed into Table::recover with the result of the level-routed FS resolver used by TableIngestion (call the same helper used there or add a helper like resolve_level_routed_fs(&index.config, &table_id) and pass that).
src/tree/ingest.rs (1)
Lines 307-315: ⚠️ Potential issue | 🟠 Major: Recover is using the wrong filesystem backend for level-routed tables.
`MultiWriter` is created with `level_fs`, but recovery uses `self.tree.config.fs`. With route-specific backends, this can fail to reopen freshly ingested tables from the correct backend.
Suggested fix
 pub struct Ingestion<'a> {
     pub(crate) folder: PathBuf,
+    level_fs: Arc<dyn crate::fs::Fs>,
     tree: &'a Tree,
@@
-    Ok(Self {
+    Ok(Self {
         folder,
+        level_fs,
         tree,
         writer,
@@
     Table::recover(
         self.folder.join(table_id.to_string()),
@@
-        self.tree.config.fs.clone(),
+        self.level_fs.clone(),
         false,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tree/ingest.rs` around lines 307 - 315, The recovery call is using the global filesystem instead of the route-specific one, causing reopen failures for level-routed tables; update the Table::recover invocation (the call near MultiWriter creation that currently passes self.tree.config.fs.clone()) to pass the route-specific level_fs (or level_fs.clone()) used to construct MultiWriter so recovery opens files from the correct backend (keep all other arguments the same and only replace self.tree.config.fs.clone() with level_fs).
src/blob_tree/mod.rs (1)
Lines 540-548: ⚠️ Potential issue | 🟠 Major: Use the routed `level_fs` for recovery here, not `config.fs`.
These tables are written via `level_fs` from `tables_folder_for_level(0)`. Recovering with `self.index.config.fs` can target a different backend and break table reopen on routed configurations.
Suggested fix
 Table::recover(
     table_folder.join(table_id.to_string()),
@@
-    self.index.config.fs.clone(),
+    level_fs.clone(),
     pin_filter,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/blob_tree/mod.rs` around lines 540 - 548, The Table::recover call is using self.index.config.fs which can target the wrong backend for routed setups; instead obtain and pass the routed level_fs used by tables_folder_for_level(0) (the same filesystem that wrote the tables) into Table::recover in place of self.index.config.fs so recovery uses the level-specific backend; update the Table::recover invocation (the call that currently passes self.index.config.fs) to pass the level_fs returned/used by tables_folder_for_level(0) (or the index method that yields the routed level filesystem) while keeping the other args (table_folder.join(table_id.to_string()), checksum, 0, self.index.id, self.index.config.cache, self.index.config.descriptor_table, pin_filter) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/file.rs`:
- Around line 66-67: The temp filename generation using TEMP_SEQ (let seq =
TEMP_SEQ.fetch_add(1, Ordering::Relaxed); let tmp_path =
folder.join(format!(".tmp_{seq}")); ) is only unique per-process; modify it to
include an additional per-process or per-file component (e.g.,
std::process::id() or a short random/salted UUID suffix) so concurrent processes
cannot collide. Update the format string used in folder.join to combine seq with
process::id() or a generated random string (ensuring any RNG initialization is
deterministic/safe for your context) and keep the TEMP_SEQ increment for
intra-process ordering.
- Around line 69-81: The temp file at tmp_path can leak if fs.rename(path)
fails; update the block around fs.rename(&tmp_path, path)? to remove the tmp
file on error (e.g., call fs.remove_file(&tmp_path) or equivalent) before
returning the original error, and ensure any errors from the cleanup are
handled/logged but do not hide the rename error; alternatively wrap the
temp-file lifecycle in a guard (scope guard or RAII) so that FsFile::sync_all,
fs.rename, and fsync_directory(folder, fs) still clean up tmp_path on failure.
In `@src/fs/mem_fs.rs`:
- Around line 377-388: The rename implementation only moves a single file
(state.files.remove(from)) and returns NotFound for directories; update rename
to support directory moves by detecting when any keys in state.files equal
`from` or have `from` as a prefix: acquire `state` via write_state(&self.state)
as before, iterate over `state.files` keys to find entries where path == from or
path.strip_prefix(from) succeeds, collect those entries, remove them, and
re-insert them under the new path by computing `to.join(relative)` (use
Path::strip_prefix to compute `relative`); if no entries found still return the
NotFound error. Keep all modifications within the write_state scope to preserve
atomicity and continue using the same error message and io::ErrorKind.
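The prefix-move logic described for `MemFs::rename` can be sketched over a plain map. `rename_tree` is a hypothetical helper, not MemFs's real method; it shows the `strip_prefix`-based re-keying the prompt asks for.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};

// Hypothetical directory-aware rename over a flat path -> bytes map:
// move every entry at `from` or underneath it.
fn rename_tree(
    files: &mut HashMap<PathBuf, Vec<u8>>,
    from: &Path,
    to: &Path,
) -> io::Result<()> {
    // Path::starts_with compares whole components, so "/ab" does not match
    // a "/a" prefix, while "/a" itself does.
    let moved: Vec<PathBuf> = files
        .keys()
        .filter(|p| p.starts_with(from))
        .cloned()
        .collect();
    if moved.is_empty() {
        return Err(io::Error::new(io::ErrorKind::NotFound, "rename source missing"));
    }
    for old in moved {
        let data = files.remove(&old).expect("key collected above");
        let new = if old.as_path() == from {
            to.to_path_buf()
        } else {
            to.join(old.strip_prefix(from).expect("prefix checked above"))
        };
        files.insert(new, data);
    }
    Ok(())
}

fn main() {
    let mut files = HashMap::new();
    files.insert(PathBuf::from("/a/x"), b"1".to_vec());
    files.insert(PathBuf::from("/a/sub/y"), b"2".to_vec());
    rename_tree(&mut files, Path::new("/a"), Path::new("/b")).unwrap();
    assert!(files.contains_key(Path::new("/b/x")));
    assert!(files.contains_key(Path::new("/b/sub/y")));
    assert!(rename_tree(&mut files, Path::new("/a"), Path::new("/c")).is_err());
}
```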
In `@src/table/util.rs`:
- Line 86: The call to file_accessor.get_or_open_table in util.rs hides whether
the FD was served from the descriptor cache, breaking table_file_cache_hit_rate;
update get_or_open_table (in src/file_accessor.rs) to either return a hit/miss
flag (e.g., (fd, cached: bool)) or perform the metrics increments internally,
then propagate that change to the util.rs call site so that
table_file_opened_cached/table_file_opened_uncached are incremented correctly
(or the returned flag is used to increment them where util.rs currently calls
get_or_open_table).
In `@src/vlog/mod.rs`:
- Line 31: The blob recovery code is currently mixing the injected Fs (parameter
fs) with host filesystem calls (Path::try_exists and std::fs::read_dir), causing
missed blobs on non-StdFs backends; update the recovery scan to use the Fs trait
methods via the provided Arc<dyn Fs> (fs) for existence checks and directory
enumeration instead of Path::try_exists()/std::fs::read_dir(), and ensure the
same Fs-backed open logic already used later (the code that opens files through
fs) is used for discovery so MemFs and other backends see the same files.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2104bd5a-4bdf-4935-934c-df0341cf6223
📒 Files selected for processing (27)
benches/index_block.rs, src/blob_tree/gc.rs, src/blob_tree/ingest.rs, src/blob_tree/mod.rs, src/compaction/flavour.rs, src/config/mod.rs, src/file.rs, src/file_accessor.rs, src/fs/mem_fs.rs, src/fs/mod.rs, src/table/inner.rs, src/table/mod.rs, src/table/multi_writer.rs, src/table/tests.rs, src/table/util.rs, src/tree/ingest.rs, src/tree/mod.rs, src/vlog/accessor.rs, src/vlog/blob_file/merge.rs, src/vlog/blob_file/meta.rs, src/vlog/blob_file/mod.rs, src/vlog/blob_file/multi_writer.rs, src/vlog/blob_file/reader.rs, src/vlog/blob_file/scanner.rs, src/vlog/blob_file/writer.rs, src/vlog/mod.rs, tests/mem_fs_tree.rs
…ntics
- Pass level-routed fs (not config.fs) to all 6 Table::recover call sites so recovery uses the same backend that wrote the table
- Store level_fs in Ingestion struct, expose MultiWriter::fs as pub(crate)
- Return (fd, cache_hit) from FileAccessor::get_or_open_table to restore table_file_opened_cached/uncached metrics in load_block
- MemFs::open sets writable from opts.write || opts.append (not always true)
- MemFs::remove_dir_all returns NotFound for nonexistent paths
- MemFs::rename validates destination parent directory exists
- rewrite_atomic: best-effort temp file cleanup on rename failure
- vlog recovery: replace try_exists + std::fs::read_dir with Fs trait calls
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/vlog/mod.rs (1)
Lines 77-92: ⚠️ Potential issue | 🟠 Major: Blob recovery still bypasses `Fs` when it reads the SFA trailer.
`fs.open(blob_file_path, ...)` gets the correct backend handle, but `sfa::Reader::new(blob_file_path)` immediately reopens the path through the host filesystem. On `MemFs`, blob recovery will still fail here even though directory discovery is now routed through `Fs`. `src/table/mod.rs` already uses the right pattern (`sfa::Reader::from_reader(&mut file)`).
🔧 Suggested fix
-let file = fs.open(blob_file_path, &crate::fs::FsOpenOptions::new().read(true))?;
+let mut file = fs.open(blob_file_path, &crate::fs::FsOpenOptions::new().read(true))?;
 let meta = {
-    let reader = sfa::Reader::new(blob_file_path)?;
+    let reader = sfa::Reader::from_reader(&mut file)?;
     let toc = reader.toc();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/vlog/mod.rs` around lines 77 - 92, Replace the direct path-based SFA reader with a reader built from the already-opened Fs handle: make the opened `file` mutable (change `let file = fs.open(...)` to `let mut file = ...`), replace `sfa::Reader::new(blob_file_path)?` with `sfa::Reader::from_reader(&mut file)?`, and keep the subsequent `toc`, `metadata_section`, and `crate::file::read_exact(&*file, ...)` usage (ensuring `file` implements Read+Seek after making it mutable). This ensures SFA trailer parsing uses the Fs-backed handle (refer to `file`, `sfa::Reader::from_reader`, and `metadata_section`).
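The general shape of this fix is parsing from an already-opened `Read + Seek` handle instead of reopening by path. A minimal sketch, assuming an invented 8-byte little-endian trailer format (not sfa's real layout):

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

// Read a fixed-size trailer through any Read + Seek handle, so the same
// code works for an Fs-opened file, a Cursor, or anything else.
fn read_trailer<R: Read + Seek>(file: &mut R) -> io::Result<u64> {
    file.seek(SeekFrom::End(-8))?;
    let mut buf = [0u8; 8];
    file.read_exact(&mut buf)?;
    Ok(u64::from_le_bytes(buf))
}

fn main() {
    let mut blob = b"payload-bytes...".to_vec();
    blob.extend_from_slice(&42u64.to_le_bytes());
    let mut handle = Cursor::new(blob); // stands in for the Fs-opened file
    assert_eq!(read_trailer(&mut handle).unwrap(), 42);
}
```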
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/file.rs`:
- Around line 58-79: Tmp files created at tmp_path (via TEMP_SEQ, folder and
fs.open/FsOpenOptions) can leak if write_all, flush or FsFile::sync_all return
an error; ensure every pre-rename failure removes the temp file by adding an
unlink-on-error guard around tmp_path: either create an RAII guard type (e.g.,
TempUnlinker) constructed with tmp_path that calls fs.remove_file(&tmp_path) in
Drop unless disarmed, or wrap the write/flush/sync sequence in a closure that,
on any Err, explicitly calls fs.remove_file(&tmp_path) before returning the
error; disarm the guard (or skip removal) only after fs.rename(&tmp_path, path)
succeeds.
In `@src/fs/mem_fs.rs`:
- Around line 283-288: The existing-file branch returns MemFile with writable
set to opts.write only, which misses append-only reopens; update
the Ok(Box::new(MemFile { ... })) in that branch so the writable field is set to
opts.write || opts.append (matching the create paths) while keeping is_append =
opts.append and readable = opts.read; adjust the MemFile construction in that
branch to use the combined condition.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2682dd3f-27b8-47ad-803b-ecedc05b45a4
📒 Files selected for processing (14)
src/blob_tree/ingest.rs, src/blob_tree/mod.rs, src/compaction/flavour.rs, src/file.rs, src/file_accessor.rs, src/fs/mem_fs.rs, src/table/mod.rs, src/table/multi_writer.rs, src/table/util.rs, src/tree/ingest.rs, src/tree/mod.rs, src/vlog/accessor.rs, src/vlog/mod.rs, tests/mem_fs_tree.rs
…g sfa bypass
- MemFs::open existing-file branch: writable = opts.write || opts.append (matches create paths and StdFs semantics for append-only mode)
- rewrite_atomic: include PID in temp filename for cross-process uniqueness; cleanup temp file on ANY failure (write/flush/sync/rename)
- vlog recovery: replace sfa::Reader::new(path) with Fs-opened handle via sfa::Reader::from_reader to stay within Fs abstraction
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/vlog/mod.rs`:
- Around line 68-69: Replace the panic-causing assert!(!dirent.is_dir) with an
error return: check dirent.is_dir and if true return a corruption-style error
that includes the offending blob_file_path (e.g.,
Err(Error::Corruption(format!("unexpected directory in blobs: {}",
blob_file_path)))) and update the enclosing function signature to return a
Result if necessary so callers can handle the recovery instead of crashing;
preserve the original path value from blob_file_path in the error message for
diagnostics.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6d3a178d-70c9-470f-bea3-eb4a4a024d77
📒 Files selected for processing (3)
src/file.rs, src/fs/mem_fs.rs, src/vlog/mod.rs
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/vlog/mod.rs`:
- Around line 63-71: The code parses file_name into a BlobFileId before checking
dirent.is_dir, which causes unrecoverable errors for non-numeric directory
names; move the early directory check so that you first inspect dirent.is_dir
(and continue) using dirent.path/dirent.is_dir, and only call
file_name.parse::<BlobFileId>() to produce blob_file_id after confirming the
entry is not a directory (preserve the existing error logging and
crate::Error::Unrecoverable handling for parse failures).
- Around line 33-34: Remove the TOCTOU pattern that calls fs.exists(folder)?
before enumerating; instead attempt to read the directory directly (use
fs.read_dir or the existing read_dir helper) on the variable folder and handle
the Err kind NotFound by returning Ok((vec![], vec![])); for other errors
propagate them. Replace the exists check in the recovery path with direct
read_dir logic so recovery enumerates atomically and treats NotFound as an empty
directory.
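The read-directly-and-map-NotFound pattern looks like this in plain `std::fs`; `list_or_empty` is an illustrative helper, not the crate's recovery code.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Enumerate a directory without a prior exists() probe, treating NotFound
// as an empty listing in one atomic step (no TOCTOU window).
fn list_or_empty(folder: &Path) -> io::Result<Vec<PathBuf>> {
    match fs::read_dir(folder) {
        Ok(entries) => entries.map(|e| e.map(|e| e.path())).collect(),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(Vec::new()),
        Err(e) => Err(e),
    }
}

fn main() {
    let missing = std::env::temp_dir().join(format!("no_such_dir_{}", std::process::id()));
    assert!(list_or_empty(&missing).unwrap().is_empty());
}
```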
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: caed869b-06c6-4265-864d-f3fc7cc24523
📒 Files selected for processing (2)
src/file.rs, src/vlog/mod.rs
Codecov Report
❌ Patch coverage is
@coderabbitai resume and full review
🧠 Learnings used
✅ Actions performed: Reviews resumed. Full review triggered.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fs/mem_fs.rs`:
- Around line 310-320: create_dir_all currently inserts every ancestor into
state.dirs without checking state.files, allowing a path to be both a file and a
directory; fix create_dir_all by, after acquiring write_state(&self.state) and
before inserting, iterating ancestors from the top down (using the same current
PathBuf logic) and for each ancestor check state.files.contains(&current): if
any ancestor exists as a file return an io::Error (e.g. io::Error::new with an
appropriate ErrorKind) instead of inserting; otherwise insert into state.dirs as
now—use the existing function names write_state and create_dir_all and the state
maps state.dirs and state.files to locate where to add the check.
- Around line 250-307: In MemFs::open(), add validation at the start to reject
invalid OpenOptions combos: if opts.truncate is true and neither opts.write nor
opts.append is true, return Err(io::Error::new(io::ErrorKind::InvalidInput,
"invalid truncate without write/append")); likewise if opts.create or
opts.create_new is true and neither opts.write nor opts.append is true, return
Err(io::Error::new(io::ErrorKind::InvalidInput, "invalid create without
write/append")). Place these checks before any file creation/lookup logic so
functions like open (and the branches that construct MemFile) never produce
read-only handles when the real std::fs would error.
In `@src/vlog/mod.rs`:
- Around line 36-42: The code currently returns Ok((vec![], vec![])) when
fs.read_dir(folder) yields ErrorKind::NotFound, which bypasses the later guard
checking blob_files.len() < ids.len() and wrongly succeeds if the manifest still
references blobs; instead, change the ErrorKind::NotFound branch so it does not
early-return but provides an empty iterator (or sets entries to an empty
collection) so the subsequent logic that compares blob_files.len() and ids will
run and fail appropriately when live blob IDs exist; update the analogous branch
at the other occurrence (lines referenced ~136-138) the same way, ensuring any
NotFound error is handled as an absent directory rather than an immediate
successful recovery.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e9ea3619-0145-4f1a-b6e8-15ce32f29302
📒 Files selected for processing (15)
benches/index_block.rs, src/blob_tree/ingest.rs, src/blob_tree/mod.rs, src/compaction/flavour.rs, src/file.rs, src/file_accessor.rs, src/fs/mem_fs.rs, src/table/inner.rs, src/table/mod.rs, src/table/multi_writer.rs, src/table/util.rs, src/tree/ingest.rs, src/vlog/accessor.rs, src/vlog/blob_file/multi_writer.rs, src/vlog/mod.rs
Actionable comments posted: 3
♻️ Duplicate comments (4)
src/vlog/blob_file/scanner.rs (1)
Line 204: ⚠️ Potential issue | 🔴 Critical: Same formatting issue as lines 7-8.
This import statement is part of the same CI formatting failure. Running `cargo fmt --all` will fix both hunks.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/vlog/blob_file/scanner.rs` at line 204, The import statement in scanner.rs has the same formatting issue as the earlier hunk; run cargo fmt --all to reformat the file and adjust the problematic use ... import line(s) in src/vlog/blob_file/scanner.rs so they match rustfmt's expected spacing/line breaks (ensure the use declarations are properly grouped and wrapped as rustfmt produces).
src/vlog/mod.rs (1)
Lines 33-43: ⚠️ Potential issue | 🔴 Critical: Still unresolved: a missing blobs directory can't be treated as empty when `ids` is non-empty.
This `NotFound` early return skips the later missing-live-blob validation, so recovery can succeed with zero recovered blobs even though the manifest still references live blob files. The updated `vlog_recovery_missing_blob_file` test still points at `Path::new(".")`, so it won't exercise this branch. Handle `NotFound` as empty only when `ids.is_empty()`, and make the test use a genuinely missing path.
🛠️ Proposed fix
 let entries = match fs.read_dir(folder) {
     Ok(entries) => entries,
     Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
-        return Ok((vec![], vec![]));
+        if ids.is_empty() {
+            return Ok((vec![], vec![]));
+        }
+        log::error!(
+            "blob folder {} is missing but {} blob file(s) are still referenced",
+            folder.display(),
+            ids.len(),
+        );
+        return Err(crate::Error::Unrecoverable);
     }
     Err(e) => return Err(e.into()),
 };
Update the test to point at a missing child path under a tempdir instead of `"."` so it actually covers this branch.
Also applies to: lines 157-163
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/vlog/mod.rs` around lines 33 - 43, The code currently treats a NotFound error from reading the blobs directory as empty unconditionally, which skips later missing-live-blob validation when ids is non-empty; modify the logic in src/vlog/mod.rs (the blobs-dir read/NotFound handling around the code that references ids and the missing-live-blob check) to treat NotFound-as-empty only when ids.is_empty(), otherwise propagate the error so the missing-live-blob validation runs; also update the vlog_recovery_missing_blob_file test to use a genuinely missing child path under a tempdir (not Path::new(".")) so it actually exercises the NotFound branch (same change needed for the similar block referenced at lines ~157-163).
src/fs/mem_fs.rs (2)
Lines 232-248: ⚠️ Potential issue | 🟠 Major: Reject invalid `FsOpenOptions` combinations before mutating state.
`MemFs::open()` currently accepts `truncate(true)` without write/append, `create`/`create_new` without write/append, and even a handle with no access mode set. Because line 275 clears the buffer during `open()`, MemFs-backed tests can silently erase data with flag sets that the real filesystem backend rejects as `InvalidInput`.
🔧 Suggested guardrails
 fn open(&self, path: &Path, opts: &FsOpenOptions) -> io::Result<Box<dyn FsFile>> {
+    let wants_access = opts.read || opts.write || opts.append;
+    let wants_write = opts.write || opts.append;
+
+    if !wants_access {
+        return Err(io::Error::new(
+            io::ErrorKind::InvalidInput,
+            "at least one access mode must be set",
+        ));
+    }
+
+    if opts.truncate && !wants_write {
+        return Err(io::Error::new(
+            io::ErrorKind::InvalidInput,
+            "truncate requires write or append access",
+        ));
+    }
+
+    if (opts.create || opts.create_new) && !wants_write {
+        return Err(io::Error::new(
+            io::ErrorKind::InvalidInput,
+            "create/create_new requires write or append access",
+        ));
+    }
+
     let mut state = write_state(&self.state)?;
     let path = path.to_path_buf();
Also applies to: lines 275-277
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fs/mem_fs.rs` around lines 232 - 248, MemFs::open currently mutates state (clears buffer) before validating FsOpenOptions and thus accepts invalid flag combos; add an upfront validation in MemFs::open to reject InvalidInput for these cases before any mutation: check that at least one access mode (read/write/append) is set, disallow truncate(true) when neither write nor append is set, and disallow create/create_new when neither write nor append is set; return an io::Error::new(ErrorKind::InvalidInput, ...) immediately on invalid combos so the subsequent buffer-clearing logic (see buffer.clear() around the current open implementation) only runs for valid options.
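The guardrails can be sketched standalone; `Opts` and `validate` below are stand-ins for `FsOpenOptions` and the check the review proposes, matching the combinations `std::fs::OpenOptions` itself rejects.

```rust
use std::io;

// Stand-in for FsOpenOptions: plain flag struct, all false by default.
#[derive(Default)]
struct Opts {
    read: bool,
    write: bool,
    append: bool,
    truncate: bool,
    create: bool,
    create_new: bool,
}

// Reject invalid combinations up front, before touching any state.
fn validate(opts: &Opts) -> io::Result<()> {
    let wants_write = opts.write || opts.append;
    if !(opts.read || wants_write) {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "no access mode set"));
    }
    if opts.truncate && !wants_write {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "truncate without write"));
    }
    if (opts.create || opts.create_new) && !wants_write {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "create without write"));
    }
    Ok(())
}

fn main() {
    assert!(validate(&Opts { read: true, ..Default::default() }).is_ok());
    assert!(validate(&Opts { read: true, truncate: true, ..Default::default() }).is_err());
    assert!(validate(&Opts { create: true, ..Default::default() }).is_err());
    assert!(validate(&Opts { append: true, create: true, ..Default::default() }).is_ok());
}
```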
Lines 310-318: ⚠️ Potential issue | 🟠 Major: `create_dir_all()` must fail when any component is already a file.
The loop at line 314 inserts every component into `state.dirs` unconditionally. `create_dir_all("/a/b")` therefore succeeds even if `/a` or `/a/b` already exists in `state.files`, leaving the same path tracked as both file and directory.
🛑 Suggested fix
 fn create_dir_all(&self, path: &Path) -> io::Result<()> {
     let mut state = write_state(&self.state)?;
     let mut current = path.to_path_buf();
     loop {
+        if state.files.contains_key(&current) {
+            return Err(io::Error::new(
+                io::ErrorKind::InvalidInput,
+                format!("path exists as a file: {}", current.display()),
+            ));
+        }
+
         state.dirs.insert(current.clone());
         if !current.pop() || current.as_os_str().is_empty() {
             break;
         }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fs/mem_fs.rs` around lines 310 - 318, The create_dir_all implementation currently inserts each path component into state.dirs without checking state.files, allowing paths to be both files and directories; update the create_dir_all function to, for each component (the loop that inserts into state.dirs), first check state.files for that exact path and return an appropriate error if any component exists as a file, otherwise proceed to create the directory, and ensure you hold the same lock semantics (e.g., the same state mutex) while performing the checks and inserts to avoid races; reference the state.dirs/state.files maps and the create_dir_all function to locate where to add this guard.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fs/mem_fs.rs`:
- Around line 248-265: open() currently checks only state.files before creating
files, so create(true)/create_new(true) can accidentally create a file over an
existing directory tracked in state.dirs; update the open() implementation (the
branches that honor create/create_new) to first check state.dirs for the target
path and refuse creation (return an appropriate error like
std::io::ErrorKind::AlreadyExists or equivalent used by this mem fs) if the path
exists in state.dirs, ensuring you do this for both creation branches referenced
around the create(true)/create_new(true) logic; keep references to state.files,
state.dirs, metadata(), and read_dir() in mind so the namespace remains
consistent.
In `@src/vlog/blob_file/writer.rs`:
- Around line 9-15: The import/use block at the top of writer.rs has formatting
drift causing cargo fmt --all -- --check to fail; run cargo fmt --all (or
rustfmt) and commit the formatted file so the use/import block is normalized, or
manually reformat the use statements in the top-level import block (the "use
..." declarations in writer.rs) to match rustfmt's grouping/spacing style before
committing.
In `@src/vlog/mod.rs`:
- Around line 10-23: The import/re-export block in src/vlog/mod.rs is
misformatted and failing rustfmt; run rustfmt by executing `cargo fmt` (or
`cargo fmt --all`) to reformat the `use` and `pub use` declarations in that
module without changing semantics, then stage and commit the resulting changes
so CI passes the `cargo fmt --all -- --check` step.
---
Duplicate comments:
In `@src/fs/mem_fs.rs`:
- Around line 232-248: MemFs::open currently mutates state (clears buffer)
before validating FsOpenOptions and thus accepts invalid flag combos; add an
upfront validation in MemFs::open to reject InvalidInput for these cases before
any mutation: check that at least one access mode (read/write/append) is set,
disallow truncate(true) when neither write nor append is set, and disallow
create/create_new when neither write nor append is set; return an
io::Error::new(ErrorKind::InvalidInput, ...) immediately on invalid combos so
the subsequent buffer-clearing logic (see buffer.clear() around the current open
implementation) only runs for valid options.
- Around line 310-318: The create_dir_all implementation currently inserts each
path component into state.dirs without checking state.files, allowing paths to
be both files and directories; update the create_dir_all function to, for each
component (the loop that inserts into state.dirs), first check state.files for
that exact path and return an appropriate error if any component exists as a
file, otherwise proceed to create the directory, and ensure you hold the same
lock semantics (e.g., the same state mutex) while performing the checks and
inserts to avoid races; reference the state.dirs/state.files maps and the
create_dir_all function to locate where to add this guard.
In `@src/vlog/blob_file/scanner.rs`:
- Line 204: The import statement in scanner.rs has the same formatting issue as
the earlier hunk; run cargo fmt --all to reformat the file and adjust the
problematic use ... import line(s) in src/vlog/blob_file/scanner.rs so they
match rustfmt’s expected spacing/line breaks (ensure the use declarations are
properly grouped and wrapped as rustfmt produces).
In `@src/vlog/mod.rs`:
- Around line 33-43: The code currently treats a NotFound error from reading the
blobs directory as empty unconditionally, which skips later missing-live-blob
validation when ids is non-empty; modify the logic in src/vlog/mod.rs (the
blobs-dir read/NotFound handling around the code that references ids and the
missing-live-blob check) to treat NotFound-as-empty only when ids.is_empty(),
otherwise propagate the error so the missing-live-blob validation runs; also
update the vlog_recovery_missing_blob_file test to use a genuinely missing child
path under a tempdir (not Path::new(".")) so it actually exercises the NotFound
branch (same change needed for the similar block referenced at lines ~157-163).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e9ea3619-0145-4f1a-b6e8-15ce32f29302
📒 Files selected for processing (26)
- benches/index_block.rs
- src/blob_tree/ingest.rs
- src/blob_tree/mod.rs
- src/compaction/flavour.rs
- src/config/mod.rs
- src/file.rs
- src/file_accessor.rs
- src/fs/mem_fs.rs
- src/fs/mod.rs
- src/table/inner.rs
- src/table/mod.rs
- src/table/multi_writer.rs
- src/table/tests.rs
- src/table/util.rs
- src/tree/ingest.rs
- src/tree/mod.rs
- src/vlog/accessor.rs
- src/vlog/blob_file/merge.rs
- src/vlog/blob_file/meta.rs
- src/vlog/blob_file/mod.rs
- src/vlog/blob_file/multi_writer.rs
- src/vlog/blob_file/reader.rs
- src/vlog/blob_file/scanner.rs
- src/vlog/blob_file/writer.rs
- src/vlog/mod.rs
- tests/mem_fs_tree.rs
…flicts
- open() rejects creating files at paths that are directories
- create_dir_all() fails if an ancestor is already a regular file
- vlog recovery: pre-populate DescriptorTable FD cache after initial open
Actionable comments posted: 4
♻️ Duplicate comments (2)
src/vlog/mod.rs (1)
9-23: ⚠️ Potential issue | 🟡 Minor

Run `cargo fmt` on this touched import block. `cargo fmt --all -- --check` is still failing on this hunk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/vlog/mod.rs` around lines 9 - 23, The touched import block around the pub use and use crate::{...} declarations in src/vlog/mod.rs is not formatted; run rustfmt/cargo fmt to fix formatting (e.g., run cargo fmt --all) and ensure the pub use and the use crate::{FileAccessor, Fs, vlog::blob_file::{Inner as BlobFileInner, Metadata}, Checksum, DescriptorTable, TreeId} lines are wrapped/indented per rustfmt rules so that cargo fmt --all -- --check passes.

src/fs/mem_fs.rs (1)
232-317: ⚠️ Potential issue | 🟠 Major

Reject invalid `OpenOptions` combinations before mutating state.

`truncate(true)` without write/append still clears the file at Line 285, and `create`/`create_new` without write/append still create a read-only handle at Lines 267 and 303. `std::fs::OpenOptions` rejects both as `InvalidInput`, so MemFs can mutate state in cases the real backend would fail before touching the filesystem.

🛠️ Suggested fix
 fn open(&self, path: &Path, opts: &FsOpenOptions) -> io::Result<Box<dyn FsFile>> {
     let mut state = write_state(&self.state)?;
     let path = path.to_path_buf();
+    let wants_write = opts.write || opts.append;
+
+    if opts.truncate && !wants_write {
+        return Err(io::Error::new(
+            io::ErrorKind::InvalidInput,
+            "truncate requires write or append access",
+        ));
+    }
+
+    if (opts.create || opts.create_new) && !wants_write {
+        return Err(io::Error::new(
+            io::ErrorKind::InvalidInput,
+            "create/create_new requires write or append access",
+        ));
+    }

     // Verify parent directory exists (mirrors std::fs behaviour).
     if let Some(parent) = path.parent()

Does Rust's `std::fs::OpenOptions` reject `truncate(true)` without write/append access, and `create(true)` / `create_new(true)` without write/append access?
Verify each finding against the current code and only fix it if needed. In `@src/fs/mem_fs.rs` around lines 232 - 317, In open(), validate FsOpenOptions combinations up front and return io::ErrorKind::InvalidInput for invalid combos (e.g. opts.truncate == true without opts.write || opts.append, and opts.create || opts.create_new true without opts.write || opts.append) before acquiring write_state or mutating state; reference the open() function and the opts fields (truncate, create, create_new, write, append, read) and ensure you perform these checks before any state.files.insert or lock(&data)?.clear() calls so the function mirrors std::fs::OpenOptions behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/file.rs`:
- Around line 50-86: The helper rewrite_atomic assumes Fs::rename will
atomically replace an existing file in the same directory, but the Fs trait
currently documents rename as “rename or error” and custom backends may not
overwrite; to fix, either tighten Fs trait docs and tests to require and assert
atomic same-directory replace semantics (update Fs trait documentation and add
tests exercising overwrite behavior) or change rewrite_atomic to tolerate
non-overwriting renames by implementing a safe fallback: attempt
fs.rename(&tmp_path, path) and if it errors with EEXIST or a rename-not-allowed
condition, perform an explicit atomic-safe replace (e.g., remove destination
then rename or use a platform-specific replace API) or return a clear error
explaining the stronger requirement; reference rewrite_atomic, Fs::rename, Fs
trait, and Config::with_fs to locate the spots to update.
In `@src/fs/mem_fs.rs`:
- Around line 350-370: The code currently lossy-converts directory entries with
name.to_string_lossy(); instead preserve raw names by using OsString: change
FsDirEntry.file_name to an OsString (update its definition in src/fs/mod.rs to
match StdFs semantics) and construct entries with file_name: name.to_os_string()
inside the loops over state.files.keys() and state.dirs (update any call
sites/serialisation to handle the new OsString type).
- Around line 319-334: The create_dir_all implementation mutates state.dirs as
it walks from leaf-to-root, which can leave stale entries if a later ancestor is
a file; instead, in create_dir_all (use the function name), first validate the
entire ancestor chain against state.files (read via write_state(&self.state) or
perform a preliminary check before any inserts) to ensure no ancestor is a
regular file, and only after validation perform the inserts into state.dirs for
the collected path components; in short: gather/validate the path chain first
(no mutations), then insert into state.dirs atomically to avoid partial creation
on error.
In `@src/vlog/mod.rs`:
- Around line 159-167: The test vlog_recovery_missing_blob_file should not rely
on the current working directory; replace Path::new(".") with a real temporary
directory created via tempfile::tempdir() and pass tempdir.path() into
recover_blob_files so the assertion checks for an actual missing blob file,
e.g., create a tempfile::TempDir, use its path as the first argument to
recover_blob_files, keep the same checksum/vector arguments and Fs
(Arc::new(crate::fs::StdFs)), and let tempdir drop at test end to clean up.
---
Duplicate comments:
In `@src/fs/mem_fs.rs`:
- Around line 232-317: In open(), validate FsOpenOptions combinations up front
and return io::ErrorKind::InvalidInput for invalid combos (e.g. opts.truncate ==
true without opts.write || opts.append, and opts.create || opts.create_new true
without opts.write || opts.append) before acquiring write_state or mutating
state; reference the open() function and the opts fields (truncate, create,
create_new, write, append, read) and ensure you perform these checks before any
state.files.insert or lock(&data)?.clear() calls so the function mirrors
std::fs::OpenOptions behavior.
In `@src/vlog/mod.rs`:
- Around line 9-23: The touched import block around the pub use and use
crate::{...} declarations in src/vlog/mod.rs is not formatted; run rustfmt/cargo
fmt to fix formatting (e.g., run cargo fmt --all) and ensure the pub use and the
use crate::{FileAccessor, Fs, vlog::blob_file::{Inner as BlobFileInner,
Metadata}, Checksum, DescriptorTable, TreeId} lines are wrapped/indented per
rustfmt rules so that cargo fmt --all -- --check passes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6b411feb-f55c-4e44-906d-472e341aa3d9
📒 Files selected for processing (4)
- rust-toolchain.toml
- src/file.rs
- src/fs/mem_fs.rs
- src/vlog/mod.rs
…nd read_dir
- Document atomic-replace semantics on Fs::rename trait method
- MemFs::create_dir_all validates ancestors before inserting (no partial state on error)
- MemFs::read_dir rejects non-UTF-8 filenames with InvalidData (matches StdFs)
- Replace cwd-dependent vlog recovery test with isolated tempdir tests
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/fs/mem_fs.rs (1)
248-310: ⚠️ Potential issue | 🟠 Major

Reject mutating open modes on non-writable handles.

These branches still mutate `MemFs` even when the returned handle has `writable: false`: Line 285 clears an existing file, and Lines 266-267 / 302-303 can create a new one. That makes MemFs-backed tests accept option combinations that cannot actually perform writes.

🛠️ Suggested guard
 let exists = state.files.contains_key(&path);
 let is_dir = state.dirs.contains(&path);
+let wants_write = opts.write || opts.append;
+
+if opts.truncate && !wants_write {
+    return Err(io::Error::new(
+        io::ErrorKind::InvalidInput,
+        "truncate requires write or append access",
+    ));
+}
+
+if (opts.create || opts.create_new) && !wants_write {
+    return Err(io::Error::new(
+        io::ErrorKind::InvalidInput,
+        "create/create_new requires write or append access",
+    ));
+}

 // Reject creating a file at a path that is already a directory.
 if is_dir && (opts.create || opts.create_new) {
Verify each finding against the current code and only fix it if needed. In `@src/fs/mem_fs.rs` around lines 248 - 310, The code currently mutates state.files (creating files or truncating) even when opts does not grant write or append rights; add guards that reject such invalid combinations with io::ErrorKind::InvalidInput before any mutation: ensure that create and create_new paths only proceed when opts.write || opts.append is true (return Err(io::Error::new(io::ErrorKind::InvalidInput, "...")) otherwise), and ensure truncate only runs when opts.write is true (reject with InvalidInput if not); apply these checks before inserting into state.files or calling lock(&data)?.clear() so MemFile and the MemFs state remain consistent (refer to opts, state.files, MemFile and the truncate/create/create_new branches).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fs/mem_fs.rs`:
- Around line 80-85: MemFs currently fakes exclusive locking (lock_exclusive()
returning Ok(())) and MemFile has no shared lock state, so multiple MemFs clones
can open the same path concurrently; implement a real per-path exclusive lock:
add a lock store (e.g., HashMap<PathBuf, LockEntry> inside the shared state used
by MemFs::clone) and update lock_exclusive()/unlock_exclusive() to
acquire/release an entry (blocking or returning Err if already held), then
attach lock ownership to MemFile instances (store a token/Arc to the LockEntry
in MemFile) so dropping/closing the file frees the lock and prevents a second
opener for the same path; update MemFs::clone(), MemFs::open()/open_dir() (or
the methods that call lock_exclusive()) and the MemFile constructor to
use/propagate that lock token.
---
Duplicate comments:
In `@src/fs/mem_fs.rs`:
- Around line 248-310: The code currently mutates state.files (creating files or
truncating) even when opts does not grant write or append rights; add guards
that reject such invalid combinations with io::ErrorKind::InvalidInput before
any mutation: ensure that create and create_new paths only proceed when
opts.write || opts.append is true (return
Err(io::Error::new(io::ErrorKind::InvalidInput, "...")) otherwise), and ensure
truncate only runs when opts.write is true (reject with InvalidInput if not);
apply these checks before inserting into state.files or calling
lock(&data)?.clear() so MemFile and the MemFs state remain consistent (refer to
opts, state.files, MemFile and the truncate/create/create_new branches).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 89191909-0998-40fd-8c8b-c59eb2cd83a2
📒 Files selected for processing (5)
- src/fs/mem_fs.rs
- src/fs/mod.rs
- src/vlog/accessor.rs
- src/vlog/blob_file/multi_writer.rs
- src/vlog/mod.rs
@coderabbitai Re: duplicate comment (mem_fs.rs:248-310) — Fixed. MemFs::open now rejects truncate/create without write or append access, matching std::fs::OpenOptions validation. |
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'lsm-tree db_bench'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.
| Benchmark suite | Current: 5c629a3 | Previous: 7d1349e | Ratio |
|---|---|---|---|
| fillseq | 895388.3552135556 ops/sec (normalized) | 1153151.9210478074 ops/sec (normalized) | 1.29 |
| fillrandom | 483696.4925240431 ops/sec (normalized) | 653001.1792087426 ops/sec (normalized) | 1.35 |
| readrandom | 243830.86207966247 ops/sec (normalized) | 306750.82456667925 ops/sec (normalized) | 1.26 |
| overwrite | 480900.162289207 ops/sec (normalized) | 667980.8595552227 ops/sec (normalized) | 1.39 |
| readwhilewriting | 211065.96703931715 ops/sec (normalized) | 257154.28586125583 ops/sec (normalized) | 1.22 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
Summary
- `MemFs` + `MemFile` — `HashMap<PathBuf, Arc<Mutex<Vec<u8>>>>`-backed virtual filesystem implementing the `Fs` trait
- `Config::with_fs()` builder method for plugging alternative Fs backends
- De-generify `Config<F: Fs>` → `Config` with `Arc<dyn Fs>` (breaking for `Config<CustomFs>` users — none exist in practice since the generic was introduced in this fork)
- Rewrite `rewrite_atomic` to use the `Fs` trait instead of the `tempfile` crate (fixes refactor(fs): rewrite_atomic — replace tempfile crate with Fs-trait atomic write #188), with best-effort temp file cleanup on failure
- Pipe `Arc<dyn Fs>` through `Table::recover`, `FileAccessor`, `load_block`, and vlog paths so all file opens go through the pluggable Fs
- Pass `Fs` to all `Table::recover` call sites (recovery uses the same backend that wrote the table)
- `table_file_opened_cached`/`uncached` metrics via `(fd, cache_hit)` return from `FileAccessor`
- Replace `std::fs::read_dir`/`try_exists` in vlog recovery with `Fs` trait calls

Design Notes
- `NotFound` → empty: `recover_blob_files` returns `Ok([], [])` when the blobs folder doesn't exist. This is pre-existing behavior (was `try_exists` before this PR) and is correct: standard (non-blob) trees don't have a blobs folder, and callers handle missing blob files via the orphan detection path downstream. See code comment at `src/vlog/mod.rs:34`.

Known Limitations
- `Tree::open` checks for `CURRENT` via `Path::try_exists()` (bypasses Fs). New trees work; reopening in-memory trees is not supported
- `SuperVersions::gc` uses `std::fs` directly — stale entries accumulate until MemFs is dropped
- Compaction is not yet fully routed through the `Fs` trait. Write + flush + point-read works; compaction may fail with ENOENT on virtual paths

Test Plan
Closes #187
Closes #188
Summary by CodeRabbit
New Features
Improvements
Tests