feat(fs): MemFs — in-memory Fs implementation for testing and in-memory trees#208
polaz wants to merge 26 commits into
…paths

Implement MemFs, a HashMap-backed in-memory Fs implementation for testing and ephemeral trees. Wire Arc<dyn Fs> through the entire I/O stack so that all file operations go through the pluggable Fs trait.

MemFs implementation:
- MemFile with Read/Write/Seek/read_at backed by Arc<Mutex<Vec<u8>>>
- Full Fs trait: open, create_dir_all, read_dir, remove_file, remove_dir_all, rename, metadata, sync_directory, exists
- Validates parent dirs, rejects dir-file conflicts, mirrors std::fs semantics (InvalidInput for truncate/create without write access)
- 20 unit tests + 5 integration tests (open tree, flush, read, range)

Config changes:
- De-generify Config<F: Fs> to Config with Arc<dyn Fs>
- Add Config::with_fs() builder for alternative backends

Fs piping (all file opens now go through Fs):
- rewrite_atomic: Fs-native temp write with PID+seq naming and best-effort cleanup on all failure paths (replaces tempfile crate)
- Table::recover accepts Arc<dyn Fs>, opens files through Fs
- FileAccessor::get_or_open_table returns (fd, cache_hit) for metrics
- Wire correct level-routed Fs to all 6 Table::recover call sites
- vlog recovery: Fs-based read_dir, exists, sfa::Reader::from_reader
- Blob recovery: pre-populate DescriptorTable FD cache, skip dirs before parse, eliminate TOCTOU with direct read_dir
- Document Fs::rename atomic-replace semantics

Closes #187
Closes #188
📝 Walkthrough
Threads an Arc<dyn Fs> filesystem abstraction through config, writers, recoveries, accessors, ingestion, compaction, and vlog; adds an in-memory MemFs backend and tests; replaces tempfile-based atomic rewrite with an Fs-backed implementation; updates Table::recover and many call sites to accept and use an explicit Fs handle.
Sequence Diagram(s)
sequenceDiagram
participant Caller
participant Table as Table::recover
participant FileAccessor
participant Descriptor as DescriptorTable
participant Fs as Arc("dyn Fs")
Caller->>Table: recover(table_id, ..., fs: Arc<dyn Fs>)
Table->>FileAccessor: get_or_open_table(table_id, path)
FileAccessor->>Descriptor: check descriptor cache
alt cache miss
FileAccessor->>Fs: open(path, read)
Fs-->>FileAccessor: FsFile handle
FileAccessor->>Descriptor: insert { table, fs: fs.clone() }
end
FileAccessor-->>Table: (FsFile, Option<bool>)
Table->>Fs: read via FsFile (ParsedMeta::load_with_handle(&*file))
Table-->>Caller: constructed Table (Inner holds Arc<dyn Fs>)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pull request overview
This PR completes the “pluggable Fs backend” story by introducing an in-memory filesystem (MemFs) and refactoring core I/O paths to route through Arc<dyn Fs> (including crash-safety critical rewrite_atomic), enabling fully in-memory trees for tests and ephemeral usage.
Changes:
- Add `MemFs`/`MemFile` in-memory `Fs` implementation and a new integration test suite exercising `Tree` operations on MemFs.
- De-generify `Config<F: Fs>` into `Config` with `Arc<dyn Fs>` and add `Config::with_fs()` to plug alternative backends.
- Thread `Fs` through recovery/open paths (tables + vlog), unify file-open caching via `FileAccessor::{get_or_open_table,get_or_open_blob_file}`, and update metrics counting.
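As a rough illustration of the de-generification described above, the sketch below shows a non-generic `Config` holding an `Arc<dyn Fs>`. The `Fs` trait here is a two-method stand-in, and `StdFs`, `ToyMemFs`, and the exact `with_fs` signature are assumptions made for the example, not the crate's real definitions.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

/// Hypothetical, minimal stand-in for the crate's `Fs` trait.
pub trait Fs: Send + Sync {
    fn write_file(&self, path: &Path, data: &[u8]) -> io::Result<()>;
    fn read_file(&self, path: &Path) -> io::Result<Vec<u8>>;
}

/// Backend that delegates to the real filesystem.
pub struct StdFs;

impl Fs for StdFs {
    fn write_file(&self, path: &Path, data: &[u8]) -> io::Result<()> {
        std::fs::write(path, data)
    }
    fn read_file(&self, path: &Path) -> io::Result<Vec<u8>> {
        std::fs::read(path)
    }
}

/// Toy in-memory backend (illustrative only, not the PR's MemFs).
#[derive(Default)]
pub struct ToyMemFs {
    files: Mutex<HashMap<PathBuf, Vec<u8>>>,
}

impl Fs for ToyMemFs {
    fn write_file(&self, path: &Path, data: &[u8]) -> io::Result<()> {
        self.files
            .lock()
            .unwrap()
            .insert(path.to_path_buf(), data.to_vec());
        Ok(())
    }
    fn read_file(&self, path: &Path) -> io::Result<Vec<u8>> {
        self.files
            .lock()
            .unwrap()
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }
}

/// Non-generic config: the backend is a trait object chosen at runtime.
pub struct Config {
    pub path: PathBuf,
    pub fs: Arc<dyn Fs>,
}

impl Config {
    /// Defaults to the real filesystem.
    pub fn new(path: impl Into<PathBuf>) -> Self {
        Self { path: path.into(), fs: Arc::new(StdFs) }
    }

    /// Assumed builder shape: swaps in an alternative backend.
    pub fn with_fs(mut self, fs: impl Fs + 'static) -> Self {
        self.fs = Arc::new(fs);
        self
    }
}
```

With this shape, `Config::new("/virtual/tree").with_fs(ToyMemFs::default())` yields a config whose reads and writes never touch disk, and callers no longer carry a type parameter for the backend.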
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/mem_fs_tree.rs | New integration tests for opening/using a Tree on MemFs. |
| src/vlog/mod.rs | Route blob recovery directory scan + opens through Fs; improve handling of missing folder/dirs. |
| src/vlog/blob_file/multi_writer.rs | Open blob files via Fs and pass Arc<dyn Fs> through writer finalization. |
| src/vlog/accessor.rs | Use FileAccessor::get_or_open_blob_file instead of std::fs + manual cache insert. |
| src/tree/mod.rs | Propagate level-routed Fs into table recovery and vlog recovery. |
| src/tree/ingest.rs | Store/propagate level-routed Fs through ingestion path. |
| src/table/util.rs | Switch block loading to get_or_open_table and restore cached/uncached metrics. |
| src/table/tests.rs | Update table test constructors/matches for new Fs plumbing and FileAccessor shape. |
| src/table/multi_writer.rs | Expose fs to crate for compaction finalization to reuse the correct backend. |
| src/table/mod.rs | Open table files via Fs in recovery; store Fs in Table/Inner. |
| src/table/inner.rs | Delete table files in Drop via Fs instead of std::fs. |
| src/fs/mod.rs | Export MemFs and document required atomic-replace semantics for Fs::rename. |
| src/fs/mem_fs.rs | New MemFs implementation + unit tests. |
| src/file.rs | Rewrite rewrite_atomic to avoid tempfile and go through Fs for crash-safe CURRENT updates. |
| src/file_accessor.rs | Extend FileAccessor to hold {table, fs} and open via Fs on cache miss. |
| src/config/mod.rs | Convert Config to store Arc<dyn Fs> and add Config::with_fs(). |
| src/compaction/flavour.rs | Ensure compaction table recovery uses the writer’s Fs. |
| src/blob_tree/mod.rs | Propagate level-routed Fs into blob tree table recovery paths. |
| src/blob_tree/ingest.rs | Pass level-routed Fs into blob ingestion recovery path. |
| rust-toolchain.toml | Bump toolchain channel to 1.94.1. |
| benches/index_block.rs | Update bench table construction to pass StdFs. |
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/tree/mod.rs (1)
1495-1495: ⚠️ Potential issue | 🟠 Major
MemFs reopen still cannot recover persisted state.
Line 1495 still enters `version::recovery::recover(tree_path)`, and Line 1691 still reaches `cleanup_orphaned_version(...)`; both paths use the real filesystem. This file also still chooses between `recover()` and `create_new()` via `Path::try_exists()`. So after a flush/version rewrite, reopening the same virtual path with `MemFs` still recreates a fresh tree or fails instead of recovering the existing state.
Also applies to: 1679-1691
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tree/mod.rs` at line 1495, The reopen logic currently calls version::recovery::recover(tree_path), Path::try_exists(), and cleanup_orphaned_version(...) which operate on the real filesystem; change these to use the virtual filesystem/MemFs-aware APIs so MemFs can recover persisted state. Specifically, replace direct calls to recover(...) and create_new(...) that take a std Path with the VFS-aware variants (or add an overload) that accept the TreePath or its Vfs handle, replace Path::try_exists() with the Vfs/TreePath existence check, and ensure cleanup_orphaned_version is invoked with Vfs-aware paths; update the code paths around the recover/create_new decision (the block containing recover(tree_path) at line 1495 and the create_new/recovery branch around 1679-1691) to use those VFS-aware functions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rust-toolchain.toml`:
- Line 2: The PR updates rust-toolchain.toml's channel value to "1.94.1" without
documenting the security rationale; update the PR and commit message to
explicitly state that the bump includes security fixes (CVE-2026-33055,
CVE-2026-33056 and regression fixes) or split this change into a dedicated
maintenance/security PR. Refer to the rust-toolchain.toml channel change when
editing the PR description and add a short entry in the changelog or release
notes referencing the security CVEs to make the rationale discoverable in git
history.
In `@src/file_accessor.rs`:
- Around line 42-56: get_or_open_table currently returns a bool that conflates
three distinct states (pinned/no-descriptor-table, descriptor-cache hit, and
newly opened file), causing Self::File to be misreported as a cache hit; change
the bool to a small enum (e.g., Pinned | Cached | Opened) and update
get_or_open_table to return that enum: for Self::File return Pinned, when
table.access_for_table(table_id) succeeds return Cached, and when you open via
fs.open(...) and call table.insert_for_table(...) return Opened; then update
callers (e.g., the code that treats the returned boolean as
table_file_opened_cached) to branch on the new enum values.
In `@src/fs/mem_fs.rs`:
- Around line 455-497: Add a unit test to verify MemFs::rename() atomically
replaces an existing destination file: create a test (e.g.,
rename_replaces_existing_destination) in the test module that uses MemFs::new(),
creates /dir, writes "old content" to /dir/dst.txt, writes "new content" to
/dir/src.txt, calls fs.rename(src, dst), asserts src no longer exists via
MemFs::exists, and reads dst to assert it contains "new content"; this confirms
the HashMap::insert overwrite behavior in rename() and safeguards rewrite_atomic
semantics.
📒 Files selected for processing (21)
- benches/index_block.rs
- rust-toolchain.toml
- src/blob_tree/ingest.rs
- src/blob_tree/mod.rs
- src/compaction/flavour.rs
- src/config/mod.rs
- src/file.rs
- src/file_accessor.rs
- src/fs/mem_fs.rs
- src/fs/mod.rs
- src/table/inner.rs
- src/table/mod.rs
- src/table/multi_writer.rs
- src/table/tests.rs
- src/table/util.rs
- src/tree/ingest.rs
- src/tree/mod.rs
- src/vlog/accessor.rs
- src/vlog/blob_file/multi_writer.rs
- src/vlog/mod.rs
- tests/mem_fs_tree.rs
…ct cache metrics
- rewrite_atomic retries with next seq on AlreadyExists (handles crash leftovers)
- MemFs::open rejects no-flag opens with InvalidInput (matches std::fs)
- FileAccessor: pinned FDs not counted as descriptor-cache hits
- Add rename-replace atomicity test
- Track Tree::open Fs migration in #209 with inline code comment
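The temp-file flow mentioned in this commit (PID+seq naming, retry on AlreadyExists, cleanup on failure, rename over the target) might look roughly like the sketch below. It uses `std::fs` directly for brevity, whereas the PR routes these calls through the `Fs` trait; the function name, the `.tmp_<pid>_<seq>` naming pattern, and the retry bound of 16 are all assumptions.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;
use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide counter so concurrent rewrites pick distinct temp names.
static TEMP_SEQ: AtomicU64 = AtomicU64::new(0);

// Assumed retry bound; the PR does not state one.
const MAX_ATTEMPTS: u32 = 16;

/// Replaces `target` with `content` via write-to-temp followed by rename.
pub fn rewrite_atomic_sketch(target: &Path, content: &[u8]) -> io::Result<()> {
    let dir = target.parent().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "target has no parent directory")
    })?;
    let pid = std::process::id();

    for _ in 0..MAX_ATTEMPTS {
        let seq = TEMP_SEQ.fetch_add(1, Ordering::Relaxed);
        let tmp = dir.join(format!(".tmp_{}_{}", pid, seq));

        // create_new fails if a leftover from a crashed run has this name;
        // in that case move on to the next sequence number.
        let mut file = match OpenOptions::new().write(true).create_new(true).open(&tmp) {
            Ok(file) => file,
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => continue,
            Err(e) => return Err(e),
        };

        let written = file.write_all(content).and_then(|()| file.sync_all());
        drop(file); // close the handle before renaming
        let result = written.and_then(|()| fs::rename(&tmp, target));
        if result.is_err() {
            // Best-effort cleanup: ignore errors from removing the temp file.
            let _ = fs::remove_file(&tmp);
        }
        return result;
    }

    Err(io::Error::new(
        io::ErrorKind::AlreadyExists,
        "no free temp file name",
    ))
}
```

The sketch leaves out syncing the parent directory after the rename, which a crash-safe version would also need.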
@coderabbitai Re: outside-diff src/tree/mod.rs:1495 (MemFs reopen): This is a documented known limitation. Created #209 to track full Tree::open Fs migration. Added inline code comment at the bypass point.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'lsm-tree db_bench'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.
| Benchmark suite | Current: 093765c | Previous: 7d1349e | Ratio |
|---|---|---|---|
| fillseq | 886976.5448707001 ops/sec (normalized) | 1153151.9210478074 ops/sec (normalized) | 1.30 |
| fillrandom | 458644.19596829236 ops/sec (normalized) | 653001.1792087426 ops/sec (normalized) | 1.42 |
| readrandom | 232890.43593753094 ops/sec (normalized) | 306750.82456667925 ops/sec (normalized) | 1.32 |
| overwrite | 484863.1751445867 ops/sec (normalized) | 667980.8595552227 ops/sec (normalized) | 1.38 |
| readwhilewriting | 200053.23668976434 ops/sec (normalized) | 257154.28586125583 ops/sec (normalized) | 1.29 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
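The Ratio column in the alert above is consistent with dividing the previous throughput by the current one, so values above 1 mean the commit got slower. A minimal sketch of that arithmetic follows; the function names are hypothetical and this is not the benchmark action's own code.

```rust
/// Ratio as it appears in the table: previous throughput over current throughput.
/// For an ops/sec metric, a value above 1.0 means the current commit is slower.
pub fn regression_ratio(previous_ops: f64, current_ops: f64) -> f64 {
    previous_ops / current_ops
}

/// True when the slowdown exceeds the alert threshold (1.15 in the alert above).
pub fn exceeds_threshold(previous_ops: f64, current_ops: f64, threshold: f64) -> bool {
    regression_ratio(previous_ops, current_ops) > threshold
}
```

For the fillseq row, 1153151.92 / 886976.54 rounds to 1.30, which is above the 1.15 threshold and therefore triggers the alert.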
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/file_accessor.rs`:
- Around line 38-41: Update the doc comments and debug label to state that the
returned bool indicates a cache hit (true) or a cache miss (false), and not to
assume false means the descriptor was freshly opened; adjust the comment on the
method that "Returns a cached table FD or opens the file via [`Fs`] on cache
miss" to clarify "false = cache miss (may be opened now or provided by
Self::File), not necessarily freshly opened", and change the
`FileAccessor::Cached` debug label and any other comments around `Self::File`
and the other documented spots (the other docblocks at the indicated ranges) to
use "cache miss" wording instead of "freshly opened".
📒 Files selected for processing (4)
- src/file.rs
- src/file_accessor.rs
- src/fs/mem_fs.rs
- src/tree/mod.rs
- FileAccessor::get_or_open returns Option<bool>: None for pinned FDs (no cache event), Some(true) for hit, Some(false) for miss
- Metrics in load_block only fire on descriptor-table cache events
- rewrite_atomic cleans up temp file on write/flush/sync failure
- Add Config::with_shared_fs(Arc<dyn Fs>) for sharing backends
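The three-state `Option<bool>` result described in this commit can be consumed as in the sketch below. The `OpenMetrics` struct and its `record` method are hypothetical; only the metric names and the meaning of `None`, `Some(true)`, and `Some(false)` come from the PR text.

```rust
/// Hypothetical counters named after the metrics mentioned in the PR.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct OpenMetrics {
    pub table_file_opened_cached: u64,
    pub table_file_opened_uncached: u64,
}

impl OpenMetrics {
    /// `None` = pinned FD (no cache event), `Some(true)` = descriptor-cache hit,
    /// `Some(false)` = descriptor-cache miss.
    pub fn record(&mut self, cache_event: Option<bool>) {
        match cache_event {
            Some(true) => self.table_file_opened_cached += 1,
            Some(false) => self.table_file_opened_uncached += 1,
            // Pinned handles never consulted the cache, so nothing is counted.
            None => {}
        }
    }
}
```

Compared with a plain `bool`, the `None` case keeps pinned handles from being reported as cache hits, which is the misreporting the earlier review comment pointed out.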
@coderabbitai re-review
✅ Actions performed: Full review triggered.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/mem_fs_tree.rs`:
- Around line 131-148: The test memfs_shared_across_trees currently calls
Config::with_fs with a MemFs instance (using MemFs::clone()), which only
exercises the non-shared path; change the test to use Config::with_shared_fs so
the shared Arc<dyn Fs> path is exercised. Concretely, construct or wrap the
MemFs into the shared form expected by Config::with_shared_fs (e.g., an Arc<dyn
Fs> or the type accepted by with_shared_fs) and call Config::with_shared_fs(...)
for both tree1 and tree2 instead of Config::with_fs(...), keeping the same
assertions that verify namespace isolation.
- Around line 73-86: The iterator code currently calls
guard.into_inner().unwrap() and String::from_utf8(...).unwrap(), introducing
panics; instead keep the raw byte vectors and avoid UTF-8 conversion—map each
guard to let (k, v) = guard.into_inner()? and collect into Vec<(Vec<u8>,
Vec<u8>)> (or use k.to_vec()/v.to_vec() without unwrap) and then assert_eq!
against byte vectors like b"a".to_vec()/b"1".to_vec(); update the variables used
(tree.iter, SeqNo::MAX, guard.into_inner(), items, and the assert_eq! target)
accordingly so no unwraps or UTF-8 parsing remain.
📒 Files selected for processing (21)
- benches/index_block.rs
- rust-toolchain.toml
- src/blob_tree/ingest.rs
- src/blob_tree/mod.rs
- src/compaction/flavour.rs
- src/config/mod.rs
- src/file.rs
- src/file_accessor.rs
- src/fs/mem_fs.rs
- src/fs/mod.rs
- src/table/inner.rs
- src/table/mod.rs
- src/table/multi_writer.rs
- src/table/tests.rs
- src/table/util.rs
- src/tree/ingest.rs
- src/tree/mod.rs
- src/vlog/accessor.rs
- src/vlog/blob_file/multi_writer.rs
- src/vlog/mod.rs
- tests/mem_fs_tree.rs
… blob files

recover_blob_files now returns Unrecoverable when the blobs folder does not exist but ids is non-empty (manifest references blob files). Only treat NotFound as empty when ids is empty (standard non-blob trees).
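The decision rule in this commit message can be sketched as a small classifier over the directory-scan error. `RecoveryError` and `classify_scan_error` are hypothetical names for illustration; the real function returns the crate's own error type.

```rust
use std::io;

/// Hypothetical error type standing in for the crate's own.
#[derive(Debug, PartialEq, Eq)]
pub enum RecoveryError {
    Unrecoverable,
    Io(io::ErrorKind),
}

/// Decides what a failed scan of the blobs folder means.
/// `referenced_ids` are the blob file IDs the manifest expects to find.
pub fn classify_scan_error(
    err: &io::Error,
    referenced_ids: &[u64],
) -> Result<(), RecoveryError> {
    match err.kind() {
        // No folder and nothing referenced: a standard non-blob tree.
        io::ErrorKind::NotFound if referenced_ids.is_empty() => Ok(()),
        // No folder but the manifest references blob files: data is missing.
        io::ErrorKind::NotFound => Err(RecoveryError::Unrecoverable),
        // Any other I/O failure is passed through.
        other => Err(RecoveryError::Io(other)),
    }
}
```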
@coderabbitai resume and full review
✅ Actions performed: Reviews resumed. Full review triggered.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/vlog/mod.rs`:
- Around line 114-124: The code currently calls dt.insert_for_blob_file
immediately (in the block that constructs FileAccessor::DescriptorTable), which
publishes file descriptors before recovery completes; instead, remove the
immediate dt.insert_for_blob_file call and defer inserting into the descriptor
table until the recovery/commit path successfully finishes (i.e., after the
open/parse and missing-file guard succeed). Locate the construction around
descriptor_table / dt and FileAccessor::DescriptorTable and change it to create
the FileAccessor without mutating dt, then perform
dt.insert_for_blob_file(global_id, Arc::from(file)) only from the successful
recovery commit/finish code path (or via an explicit commit callback) so failed
opens do not leak handles.
📒 Files selected for processing (1)
src/vlog/mod.rs
- Inner::Drop: evict descriptor-table entry before fs.remove_file so Windows doesn't fail on open-file removal
- recover_blob_files: defer insert_for_blob_file until all blobs parse successfully, preventing partial FD leaks on recovery failure
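The deferred-insert idea from the second bullet can be illustrated with a generic helper: collect every opened handle in a local buffer and publish to the shared cache only once all of them parsed. The helper name, the `HashMap` standing in for the descriptor table, and the `String` handle type are assumptions for the sketch.

```rust
use std::collections::HashMap;

/// Opens and parses every blob first; publishes handles to `cache` only if all succeed.
/// `open_and_parse` is a caller-supplied stand-in for the real open + parse step.
pub fn recover_all<F>(
    ids: &[u64],
    cache: &mut HashMap<u64, String>,
    mut open_and_parse: F,
) -> Result<usize, String>
where
    F: FnMut(u64) -> Result<String, String>,
{
    // Handles accumulate here, not in the shared cache.
    let mut pending = Vec::with_capacity(ids.len());
    for &id in ids {
        // An early return drops `pending`, releasing every handle opened so far.
        pending.push((id, open_and_parse(id)?));
    }

    let published = pending.len();
    cache.extend(pending);
    Ok(published)
}
```

If any blob fails to parse, the cache is left exactly as it was, so a failed recovery does not leave stray entries behind.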
@coderabbitai Re: #54 (deferred blob cache inserts): Fixed. Also fixed Copilot #52/#53:
@coderabbitai full review
✅ Actions performed: Full review triggered.
MemFile::read, write, set_len, and read_at used unchecked `as usize` casts on u64 cursor/offset/size values. On 32-bit targets these silently truncate, corrupting file state. Now use usize::try_from with InvalidInput errors. Removed all cast_possible_truncation expects.
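The checked conversion described in this commit can be sketched as below. The helper names and the free-function `read_at` over a byte slice are illustrative assumptions; the PR applies the same idea inside `MemFile`.

```rust
use std::convert::TryFrom;
use std::io;

/// Converts a u64 file offset to usize, failing instead of silently truncating.
pub fn offset_to_usize(offset: u64) -> io::Result<usize> {
    usize::try_from(offset).map_err(|_| {
        io::Error::new(
            io::ErrorKind::InvalidInput,
            "offset does not fit in usize on this target",
        )
    })
}

/// Positional read from an in-memory buffer; returns the number of bytes copied.
pub fn read_at(data: &[u8], buf: &mut [u8], offset: u64) -> io::Result<usize> {
    let start = offset_to_usize(offset)?;
    if start >= data.len() {
        return Ok(0); // reading at or past the end yields no bytes
    }
    let n = buf.len().min(data.len() - start);
    buf[..n].copy_from_slice(&data[start..start + n]);
    Ok(n)
}
```

On a 64-bit target the conversion never fails; on a 32-bit target an offset above `u32::MAX` produces `InvalidInput` where an `as usize` cast would have wrapped.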
Inner::Drop now moves the FileAccessor out via std::mem::replace (with a Closed sentinel) and drops it before calling fs.remove_file. This ensures both pinned File handles and descriptor-table cached FDs are released before deletion, fixing remove_file failures on Windows where open handles prevent file removal.
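A reduced version of this Drop pattern is sketched below, using `std::fs::File` in place of the crate's handle types. `Accessor`, `TableFile`, and the `delete_on_drop` flag are hypothetical simplifications.

```rust
use std::fs::{self, File};
use std::mem;
use std::path::PathBuf;

/// Hypothetical simplification of the accessor enum.
pub enum Accessor {
    Pinned(File),
    /// Sentinel left behind once the handle has been moved out.
    Closed,
}

pub struct TableFile {
    pub path: PathBuf,
    pub accessor: Accessor,
    pub delete_on_drop: bool,
}

impl Drop for TableFile {
    fn drop(&mut self) {
        // `drop` only has `&mut self`, so the handle cannot be moved out directly;
        // swap in the sentinel and take ownership of the old value.
        let handle = mem::replace(&mut self.accessor, Accessor::Closed);
        drop(handle); // closes the file before the deletion below

        if self.delete_on_drop {
            // Best effort: a failed removal is ignored in this sketch.
            let _ = fs::remove_file(&self.path);
        }
    }
}
```

Without the swap, the struct's fields would be dropped only after `drop` returns, which means the file would still be open when `remove_file` runs.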
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fs/mem_fs.rs`:
- Around line 311-323: The code currently rejects append+truncate
unconditionally; change the checks so truncate+append is only an error when
create_new is not set. Specifically, update the first check around opts.truncate
&& opts.append to allow the combination when opts.create_new is true (e.g., only
return the InvalidInput error if opts.truncate && opts.append &&
!opts.create_new). Likewise, update the truncate vs write check to not fail when
opts.create_new is set (i.e., only error if opts.truncate && !opts.write &&
!opts.create_new) so create_new+append+truncate follows std::fs::OpenOptions
behavior; locate and modify these checks around the opts variable in the
function that handles FsOpenOptions/wants_write in mem_fs.rs.
📒 Files selected for processing (1)
src/fs/mem_fs.rs
    if opts.truncate && opts.append {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "truncate and append cannot be used together",
        ));
    }
    if opts.truncate && !opts.write {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "truncate requires write access",
        ));
    }
    if (opts.create || opts.create_new) && !wants_write {
🧩 Analysis chain
🏁 Script executed:
head -330 src/fs/mem_fs.rs | tail -40
Repository: structured-world/coordinode-lsm-tree
Length of output: 1642
🌐 Web query:
std::fs::OpenOptions create_new truncate append combination behavior
💡 Result:
When using `std::fs::OpenOptions` with `create_new(true)`, the `create` and `truncate` options are ignored, regardless of whether `append(true)` is also set.
If the file does not exist:
- The file is created as new (empty).
- The file is opened successfully with the specified access modes (must include write or append).
- If `append(true)` is set, all writes will append to the end of the file (which starts empty).
- `truncate(true)` has no effect since it's ignored and the file is new anyway.
If the file already exists:
- `open` fails with `io::ErrorKind::AlreadyExists`.
This combination is explicitly allowed (no InvalidInput error). Note that read-append with `create_new` is permitted per the design in RFC 1252.
Example that succeeds (file created empty, writes append):
    use std::fs::OpenOptions;
    let mut opts = OpenOptions::new();
    opts.write(true).append(true).create_new(true).truncate(true).open("foo.txt")?;
Since `create_new` succeeds only for new files, `truncate` is irrelevant, and `append` ensures writes go to EOF. Official Rust documentation confirms: "If .create_new(true) is set, .create and .truncate are ignored." Sources also list allowed combinations including `create_new` with `append`.
Citations:
- 1: https://doc.rust-lang.org/std/fs/struct.OpenOptions.html
- 2: https://static.rust-lang.org/doc/master/std/fs/struct.OpenOptions.html
- 3: https://doc.rust-lang.org/stable/std/fs/struct.OpenOptions.html
- 4: https://rust-lang.github.io/rfcs/1252-open-options.html
Allow create_new + append + truncate to reach the create-new path.
FsOpenOptions is documented as mirroring std::fs::OpenOptions, but the checks at lines 311–316 reject append(true).create_new(true).truncate(true) unconditionally. In std, create_new(true) ignores truncate(true), so this combination is explicitly allowed and should be accepted here.
Proposed fix
if opts.truncate && opts.append {
+ if opts.create_new {
+ // create_new(true) ignores truncate(true), so this is allowed
+ } else {
return Err(io::Error::new(
io::ErrorKind::InvalidInput,
"truncate and append cannot be used together",
));
+ }
}
if opts.truncate && !opts.write {
return Err(io::Error::new(
io::ErrorKind::InvalidInput,
- "truncate requires write access",
+ "truncate requires write or append access",
));
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/fs/mem_fs.rs` around lines 311 - 323, The code currently rejects
append+truncate unconditionally; change the checks so truncate+append is only an
error when create_new is not set. Specifically, update the first check around
opts.truncate && opts.append to allow the combination when opts.create_new is
true (e.g., only return the InvalidInput error if opts.truncate && opts.append
&& !opts.create_new). Likewise, update the truncate vs write check to not fail
when opts.create_new is set (i.e., only error if opts.truncate && !opts.write &&
!opts.create_new) so create_new+append+truncate follows std::fs::OpenOptions
behavior; locate and modify these checks around the opts variable in the
function that handles FsOpenOptions/wants_write in mem_fs.rs.
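The `std::fs::OpenOptions` behavior cited in this comment can be checked against the real filesystem with a small helper like the one below. The helper name is hypothetical; the calls are standard library APIs.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Opens `path` with append + truncate, optionally adding create_new.
/// Returns only success or failure; the file handle is closed immediately.
pub fn open_append_truncate(path: &Path, create_new: bool) -> io::Result<()> {
    OpenOptions::new()
        .append(true)
        .truncate(true)
        .create_new(create_new)
        .open(path)
        .map(|_file| ())
}
```

Called on a path that does not exist yet, `open_append_truncate(path, true)` should succeed, and a second call with `true` should fail with `AlreadyExists`. With `false`, std rejects the append + truncate combination with `InvalidInput`, which is the case the existing MemFs check already mirrors.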
    ///
    /// This is used in case file descriptor cache is `None` (to skip cache lookups)
    File(Arc<dyn FsFile>),

    /// Access to file descriptor cache
    DescriptorTable(Arc<DescriptorTable>),
    /// Access to file descriptor cache with [`Fs`]-based fallback for
    /// cache misses.
    DescriptorTable {
        /// The FD cache.
        table: Arc<DescriptorTable>,
        /// Filesystem backend for opening files on cache miss.
        fs: Arc<dyn Fs>,
    },

    /// Sentinel used during [`Drop`] to move ownership of the file handle
    /// before deleting the underlying file. Not constructed outside `Drop`.
    #[doc(hidden)]
    Closed,
}
FileAccessor is a publicly exported enum (#[doc(hidden)] pub mod file_accessor), and this change adds a new Closed variant. Adding variants to a public enum is a semver-breaking API change and also allows external callers to construct FileAccessor::Closed, which is meant to be an internal sentinel (per the doc comment). Consider making FileAccessor pub(crate) (or moving it to a private module), or marking it #[non_exhaustive] and providing a crate-private sentinel type so external code cannot construct the internal-only state.
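One possible shape for the "crate-private sentinel" suggestion is to keep the state enum private behind a public wrapper, so external code can neither name nor construct the closed state. The sketch below is an assumption about how that could look, with a `String` standing in for the real file handle; it is not the crate's actual layout.

```rust
mod file_accessor {
    /// Private state: code outside this module cannot name or construct `Closed`.
    enum State {
        Open(String),
        Closed,
    }

    /// Public wrapper whose only field is private.
    pub struct FileAccessor {
        state: State,
    }

    impl FileAccessor {
        pub fn open(name: impl Into<String>) -> Self {
            Self { state: State::Open(name.into()) }
        }

        /// Returns the handle name while open, `None` after teardown.
        pub fn name(&self) -> Option<&str> {
            match &self.state {
                State::Open(name) => Some(name),
                State::Closed => None,
            }
        }

        /// Crate-internal teardown hook, e.g. for use from a `Drop` impl.
        pub(crate) fn close(&mut self) {
            self.state = State::Closed;
        }
    }
}

pub use self::file_accessor::FileAccessor;
```

Because the variants are no longer part of the public API, adding or removing internal states stops being a semver concern. The cost is that existing code matching on the enum's variants would need accessor methods instead.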
Closing to squash history and reopen with clean commits.
Summary
- `MemFs` + `MemFile`: `HashMap<PathBuf, Arc<Mutex<Vec<u8>>>>`-backed virtual filesystem implementing the `Fs` trait
- `Config::with_fs()` builder method for plugging alternative Fs backends
- `Config<F: Fs>` → `Config` with `Arc<dyn Fs>` (breaking for `Config<CustomFs>` users; none exist in practice since the generic was introduced in this fork)
- `rewrite_atomic` now uses the `Fs` trait instead of the `tempfile` crate (fixes "refactor(fs): rewrite_atomic - replace tempfile crate with Fs-trait atomic write" #188), with PID+seq temp naming, retry on AlreadyExists, and best-effort cleanup
- `Arc<dyn Fs>` threaded through `Table::recover`, `FileAccessor`, `load_block`, and vlog paths so all file opens go through the pluggable Fs
- Level-routed `Fs` wired to all `Table::recover` call sites (recovery uses the same backend that wrote the table)
- `table_file_opened_cached`/`uncached` metrics via `(fd, cache_hit)` return from `FileAccessor` (pinned FDs are not counted as cache hits)
- `std::fs::read_dir`/`try_exists` in vlog recovery replaced with `Fs` trait calls
- Atomic-replace semantics documented on the `Fs::rename` trait method
- `rust-toolchain.toml` bumped from 1.94.0 → 1.94.1 to match CI `rustfmt` version

Design Notes
- `NotFound` handling: `recover_blob_files` returns `Ok([], [])` when the blobs folder doesn't exist and `ids` is empty (standard non-blob trees). When `ids` is non-empty (manifest references blob files), a missing folder is unrecoverable corruption and returns `Err(Unrecoverable)`.
- `lock_exclusive` no-op in MemFs: In-memory files are not shared across processes. MemFs is a test/ephemeral backend, so cross-process exclusivity is not meaningful.

Known Limitations
- `Tree::open` uses `try_exists()` and `std::fs`-based recovery, bypassing the `Fs` trait. New trees work; reopening in-memory trees is not supported. Tracked in #209 ("refactor(fs): migrate Tree::open recovery path to Fs trait").
- `SuperVersions::gc` uses `std::fs` directly; stale entries accumulate until MemFs is dropped. Tracked in #209.
- … `Fs` trait. Write + flush + point-read works; compaction may fail with ENOENT on virtual paths.

Test Plan
- `cargo fmt -- --check` clean (rustfmt 1.94.1)

Closes #187
Closes #188
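To make the `HashMap<PathBuf, Arc<Mutex<Vec<u8>>>>` layout from the summary concrete, here is a toy store with a rename that replaces an existing destination under a single lock. `SketchMemFs` and its methods are hypothetical and much smaller than the PR's `MemFs` (no directories, open options, or file handles).

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

/// Shared, mutable contents of one file.
type FileData = Arc<Mutex<Vec<u8>>>;

/// Toy store; clones share the same underlying map.
#[derive(Clone, Default)]
pub struct SketchMemFs {
    files: Arc<Mutex<HashMap<PathBuf, FileData>>>,
}

impl SketchMemFs {
    pub fn write(&self, path: &Path, data: &[u8]) {
        let mut files = self.files.lock().unwrap();
        files.insert(path.to_path_buf(), Arc::new(Mutex::new(data.to_vec())));
    }

    pub fn read(&self, path: &Path) -> io::Result<Vec<u8>> {
        let files = self.files.lock().unwrap();
        let data = files
            .get(path)
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))?;
        let bytes = data.lock().unwrap().clone();
        Ok(bytes)
    }

    pub fn exists(&self, path: &Path) -> bool {
        self.files.lock().unwrap().contains_key(path)
    }

    /// Moves `src` to `dst` while holding the map lock, so no reader can observe
    /// a state where `dst` is missing. An existing `dst` is overwritten.
    pub fn rename(&self, src: &Path, dst: &Path) -> io::Result<()> {
        let mut files = self.files.lock().unwrap();
        let data = files
            .remove(src)
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))?;
        files.insert(dst.to_path_buf(), data);
        Ok(())
    }
}
```

The overwrite in `rename` relies on `HashMap::insert` replacing the previous value, which is the behavior the rename-replace test requested in the review is meant to pin down.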