refactor(fs): thread Fs through table::Writer and BlobFile creation#107
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (25)
📝 WalkthroughWalkthroughThreads the Fs filesystem abstraction through writers, blob-file creation, version persistence, directory fsyncs, and related call sites so IO uses injected filesystem implementations instead of direct std::fs calls. (Adds fs parameters/fields and updates constructors and persist/upgrade call sites.) Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Component (Compaction / Ingest / Tree)
participant Writer as Table/Blob Writer
participant Version as SuperVersions
participant FS as Fs implementation
participant Disk as Underlying storage
Client->>Writer: new(..., fs.clone())
Writer->>FS: open(path, FsOpenOptions { write, create_new })
FS->>Disk: create file
Writer->>FS: write(data)
Writer->>FS: FsFile::sync_all()
Client->>Version: upgrade_version(..., fs)
Version->>FS: persist_version(folder, manifest, comparator, fs)
FS->>Disk: write version file
FS->>Disk: sync directory
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Pull request overview
This PR refactors the write path to consistently use the crate’s Fs abstraction instead of hardcoded std::fs, enabling alternative filesystem backends (e.g., io_uring-oriented) and future per-level filesystem routing.
Changes:
- Generalizes table index/filter writer traits to accept a generic writer type (instead of
std::fs::File-specificfinish()signatures). - Makes table writers and blob-file writers generic over
FS: Fs = StdFs, and threadsfsthrough version persistence and atomic rewrite helpers. - Updates call sites across tree creation/ingestion/compaction and tests to pass
config.fsthrough.
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/vlog/blob_file/writer.rs | Makes blob writer generic over FS and switches file creation/sync to Fs/FsFile. |
| src/vlog/blob_file/multi_writer.rs | Threads fs through blob multi-writer lifecycle and deletion paths. |
| src/vlog/blob_file/scanner.rs | Updates tests to construct writers with StdFs. |
| src/vlog/blob_file/reader.rs | Updates tests to pass an Arc<StdFs> into blob writer construction. |
| src/vlog/blob_file/merge.rs | Updates tests to construct writers with StdFs. |
| src/table/writer/mod.rs | Makes table writer generic over FS, uses Fs for create/remove/sync, and updates writer trait object types. |
| src/table/writer/index/* | Generalizes index writer trait and implementations to accept W: Write + Seek. |
| src/table/writer/filter/* | Generalizes filter writer trait and implementations to accept W: Write + Seek. |
| src/table/multi_writer.rs | Makes table multi-writer generic over FS and passes fs into rotated writers. |
| src/table/tests.rs | Updates table tests to construct writers with Arc<StdFs>. |
| src/file.rs | Threads Fs through rewrite_atomic() and fsync_directory() for directory syncing. |
| src/version/persist.rs | Threads Fs through version persistence and atomic rewrite of CURRENT. |
| src/version/super_version.rs | Threads Fs through version upgrade/persist call chain. |
| src/tree/mod.rs | Passes config.fs into writer/version upgrade paths and directory fsyncs. |
| src/tree/inner.rs | Passes config.fs into initial version persistence. |
| src/tree/ingest.rs | Passes config.fs into ingestion writers and version upgrade paths. |
| src/compaction/* | Passes config.fs into compaction writers and version upgrade paths. |
| src/blob_tree/* | Passes config.fs into blob-tree writers and version upgrade paths. |
53a280e to
8e1dce1
Compare
There was a problem hiding this comment.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'lsm-tree db_bench'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.
| Benchmark suite | Current: 780bbf9 | Previous: d7279d3 | Ratio |
|---|---|---|---|
readseq |
2336815.33618255 ops/sec |
3054133.461936785 ops/sec |
1.31 |
mergerandom |
644995.6304771013 ops/sec |
784591.1446116187 ops/sec |
1.22 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
src/table/writer/index/mod.rs:1
- Requiring
W: Write + Seekat the trait level forces all backends (and thusFS::Fileused by table writers) to be seekable, which can block non-seekable write-oriented implementations (e.g., some io_uring-style or append-only abstractions). If only some implementations require seeking, consider moving theSeekbound to the specific methods/implementations that need it (e.g., via method-levelwhere W: Seek), or splitting into separate traits for seek-free vs seek-required index writers.
// Copyright (c) 2024-present, fjall-rs
8e1dce1 to
4935dfb
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
src/table/writer/mod.rs:1
- This bypasses the existing
file::fsync_directory(...)wrapper which is explicitly a no-op on Windows. UnlessFs::sync_directoryis guaranteed to be Windows-safe/no-op by contract (including for non-StdFs implementations), this can cause Windows failures/regressions. Consider callingcrate::file::fsync_directory(parent, &*self.fs)here (or adding a cfg/no-op guarantee to theFstrait contract) to preserve cross-platform behavior.
// Copyright (c) 2024-present, fjall-rs
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
0bef6cc to
10f2477
Compare
f2769c8 to
708bfa9
Compare
e0c749d to
ff78409
Compare
- Generalize BlockIndexWriter/FilterWriter traits to use generic W instead of hardcoded std::fs::File in finish() methods - Make table::Writer<FS: Fs = StdFs> generic, use Fs::open() for file creation, Fs::sync_directory() and Fs::remove_file() in finish() - Make table::MultiWriter<FS: Fs = StdFs> generic, pass Fs through - Make vlog::blob_file::Writer<FS: Fs = StdFs> generic, use Fs::open() - Make vlog::blob_file::MultiWriter<FS: Fs = StdFs> generic - Thread Fs through rewrite_atomic(), fsync_directory(), persist_version() and upgrade_version()/upgrade_version_with_seqno() - Update all call sites to pass config.fs through Closes #91
…path - BlobFile Writer::new now uses create_new(true) instead of create(true), matching the table writer pattern and preventing silent overwrites - Add comment explaining why consume_writer read-back still uses std::fs::File (FileAccessor is not yet generic over Fs)
Add doc comments noting these are not public API, so the added Fs parameter is not a breaking change for external consumers.
FsFile: Write + Seek is enforced by the Fs trait, so no explicit where-clause is needed on Writer<FS>.
Writer is re-exported via table::mod, so the comment was misleading.
Use plain language (`implements FsFile + Write + Seek`) instead of nested-bound syntax that reads like `FsFile: Write + Seek`.
…ectory The helper handles the Windows no-op case, keeping behavior consistent with all other fsync call sites.
Explain that BlockIndexWriter/FilterWriter are generic over W = BufWriter<FS::File>, not directly over FS::File.
Blob file IDs are monotonically unique, so path collision is a bug. Orphaned files from crashes are cleaned during recovery.
Move FsOpenOptions import inside #[cfg(not(windows))] block and add let _ = &fs to suppress unused-variable warning on Windows where post-persist sync is skipped.
Replace remaining std::fs::create_dir_all calls in Tree::create_new, recover_levels, and BlobTree::open with Fs::create_dir_all for consistency with the Fs abstraction. Also add note about tempfile crate's std::fs dependency in rewrite_atomic.
The Windows fsync_directory no-op doesn't use path, so prefix with _ to suppress unused-variable warnings, consistent with the _fs param.
Ensures the existence check goes through the configured Fs backend, consistent with the create_dir_all and fsync_directory calls that follow.
Move std::path::absolute() before fs.open() so open, remove_file, and fsync_directory all operate on the same absolute path. Previously open() received the raw path while cleanup used the absolutized one.
Replace generic FS: Fs type parameters with Arc<dyn Fs> / &dyn Fs throughout Writer, MultiWriter, and helpers, following the Fs trait simplification in #109 that removed associated types in favor of Box<dyn FsFile> returns.
Replace std::fs::read_dir with config.fs.read_dir for consistency with the Fs abstraction. Destructure FsDirEntry directly instead of calling std::fs::DirEntry methods.
93eee6d to
34e94ca
Compare
|
@coderabbitai re-review |
|
🧠 Learnings used✅ Actions performedFull review triggered. |
Summary
BlockIndexWriter/FilterWritertraits to use genericWinstead of hardcodedstd::fs::Fileinfinish()methodstable::Writer,table::MultiWriter,vlog::blob_file::Writer,vlog::blob_file::MultiWriteruseArc<dyn Fs>/Box<dyn FsFile>for pluggable filesystem backendsFsthroughrewrite_atomic(),fsync_directory(),persist_version(), andupgrade_version()std::fs::create_dir_all/Path::try_existswithFs::create_dir_all/Fs::existsin tree creation and recoveryconfig.fsthroughThis eliminates the last direct
std::fsdependency from the write path, enabling:Test plan
cargo test --lib --all-features— 519 passed, 0 failedCloses #91