Feat: Add batch write implementation #290
Conversation
📝 Walkthrough

The PR introduces a batch write API.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User as Client/User
    participant AT as AbstractTree<br/>(Tree/BlobTree)
    participant MT as Memtable
    participant SM as SkipMap
    User->>AT: write_batch(items, seqno)
    AT->>AT: Convert BatchItem to InternalValue
    AT->>MT: insert_batch(internal_values)
    rect rgba(100, 150, 255, 0.5)
        Note over MT: Batch Processing
        MT->>SM: Insert all items
        MT->>MT: Accumulate total_size
        MT->>MT: Atomic update:<br/>approximate_size += total_size
        MT->>MT: Atomic update:<br/>highest_seqno = seqno
    end
    MT-->>AT: (inserted_size, memtable_size)
    AT-->>User: (inserted_size, memtable_size)
```
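The flow in the diagram can be sketched with simplified stand-ins. All type names below mirror the PR's vocabulary but are placeholders, not the crate's real definitions:

```rust
// Simplified stand-ins for the crate's types (assumptions, not the real API).
type UserKey = Vec<u8>;
type UserValue = Vec<u8>;
type SeqNo = u64;

enum BatchItem {
    Insert(UserKey, UserValue),
    Remove(UserKey),
}

struct InternalValue {
    key: UserKey,
    seqno: SeqNo,
    value: UserValue,
    is_tombstone: bool,
}

#[derive(Default)]
struct Memtable {
    size: u64,
}

impl Memtable {
    // Returns (bytes inserted by this batch, memtable size afterwards).
    fn insert_batch(&mut self, values: Vec<InternalValue>) -> (u64, u64) {
        let inserted: u64 = values
            .iter()
            .map(|v| (v.key.len() + v.value.len()) as u64)
            .sum();
        self.size += inserted;
        (inserted, self.size)
    }
}

// write_batch stamps every item with the batch's seqno, converts it to an
// InternalValue (removals become tombstones), and delegates to the memtable.
fn write_batch(mt: &mut Memtable, items: Vec<BatchItem>, seqno: SeqNo) -> (u64, u64) {
    let values = items
        .into_iter()
        .map(|item| match item {
            BatchItem::Insert(key, value) => InternalValue { key, seqno, value, is_tombstone: false },
            BatchItem::Remove(key) => InternalValue { key, seqno, value: Vec::new(), is_tombstone: true },
        })
        .collect();
    mt.insert_batch(values)
}

fn main() {
    let mut mt = Memtable::default();
    let batch = vec![
        BatchItem::Insert(b"key".to_vec(), b"value".to_vec()),
        BatchItem::Remove(b"old".to_vec()),
    ];
    let (inserted, total) = write_batch(&mut mt, batch, 42);
    println!("inserted={inserted} total={total}");
}
```

Both return values flow back unchanged through the `AbstractTree` layer, which is why the diagram shows the same `(inserted_size, memtable_size)` tuple at both hops.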
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@benches/tree.rs`:
- Around lines 436-459: The benchmarks build the `Tree` once outside `b.iter()`, so iterations accumulate state and end up measuring overwrite/flush behavior. Change both benchmarks (the `group.bench_function` cases named "naive loop, {batch_size} items" and "write_batch, {batch_size} items") to create a fresh `TempDir`/`Tree` inside each timed iteration using `Bencher::iter_batched` or `iter_with_setup`, so each measured call starts with an empty tree. Pass the prepared items (or an iterator of `BatchItem::Insert`) into the timed closure, and register throughput via `Throughput::Elements(batch_size)` so results report per-item cost.
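The fix above maps directly onto Criterion's `Bencher::iter_batched`. As a dependency-free illustration of why the setup must run outside the timed region, here is a hypothetical miniature of that pattern (`bench_batched` is not Criterion's API; a `Vec` stands in for the tree):

```rust
use std::time::{Duration, Instant};

// Hypothetical miniature of Criterion's iter_batched: `setup` builds fresh
// input (e.g. a new TempDir/Tree plus the prepared items) outside the timed
// region, so `routine` always starts from identical, empty state.
fn bench_batched<I>(
    iters: u32,
    mut setup: impl FnMut() -> I,
    mut routine: impl FnMut(I),
) -> Duration {
    let mut total = Duration::ZERO;
    for _ in 0..iters {
        let input = setup(); // NOT timed: fresh state every iteration
        let start = Instant::now();
        routine(input); // timed: only the batch write itself
        total += start.elapsed();
    }
    total
}

fn main() {
    // Stand-in "tree": a Vec rebuilt per iteration instead of being reused,
    // so no iteration observes the previous iteration's writes.
    let elapsed = bench_batched(
        10,
        || Vec::<u64>::new(),
        |mut tree| {
            for i in 0..1_000 {
                tree.push(i);
            }
        },
    );
    println!("timed {elapsed:?} across 10 fresh-state iterations");
}
```

With the tree reused across iterations (the current code), every iteration after the first measures overwrites of existing keys plus any flush activity, which is a different workload from the batch-insert path the benchmark is named after.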
In `@src/abstract_tree.rs`:
- Around lines 20-33: The enum `BatchItem` currently declares a `V: Into<UserValue>` bound, which forces callers issuing only removals to provide a concrete `V` type. Remove the `V: Into<UserValue>` bound from the enum declaration and constrain the conversion only where it is needed: either (a) give the `Insert` variant a dedicated constructor that is generic over the value (e.g. `impl BatchItem { pub fn insert<K, V: Into<UserValue>>(k: K, v: V) -> BatchItem<K> { .. } }`), so only `Insert` requires `Into<UserValue>`, or (b) drop the bound on `BatchItem` entirely and enforce `V: Into<UserValue>` at the `write_batch` call site where inserts are converted. Update usages of `BatchItem::Insert` and `write_batch` accordingly so type inference for `Remove`/`RemoveWeak` no longer needs a `V` type.
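A minimal sketch of that suggestion, assuming `UserValue` is a byte buffer (the alias below is a stand-in, not the crate's definition):

```rust
// Stand-in for the crate's UserValue (assumption for this sketch).
type UserValue = Vec<u8>;

// No `V: Into<UserValue>` bound on the enum itself, so a removals-only
// batch needs no phantom value type to satisfy inference.
#[derive(Debug)]
enum BatchItem<K> {
    Insert(K, UserValue),
    Remove(K),
    RemoveWeak(K),
}

impl<K> BatchItem<K> {
    // The Into<UserValue> conversion is constrained only where a value exists.
    fn insert<V: Into<UserValue>>(key: K, value: V) -> Self {
        BatchItem::Insert(key, value.into())
    }
}

fn main() {
    // `Remove`/`RemoveWeak` now infer cleanly without a value type annotation.
    let batch = vec![
        BatchItem::insert("a", "hello"),
        BatchItem::Remove("b"),
        BatchItem::RemoveWeak("c"),
    ];
    println!("{} items in batch", batch.len());
}
```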
In `@src/memtable/mod.rs`:
- Around lines 180-209: `insert_batch` currently inserts into `self.items` before updating `approximate_size` and `highest_seqno`, so readers can observe the new items while the size/seqno counters are still stale. Change `insert_batch` to first iterate the items to compute `total_item_size` and the maximum seqno (without inserting), then call `self.approximate_size.fetch_add(total_item_size, Ordering::AcqRel)` and update `highest_seqno` (e.g. via `fetch_max`) before the second loop that performs `self.items.insert(..)` for each item, mirroring the ordering of the single-item insert path.
- Around lines 178-214: `batch_seqno` in `Memtable::insert_batch` captures only the first item's `SeqNo`, so `highest_seqno` may be wrong for mixed-seqno batches. Update the loop over `items` to track the maximum seqno (e.g. maintain a `max_seqno: Option<SeqNo>` or compare each `item.key.seqno`) instead of setting `batch_seqno` once, then after the loop call `self.highest_seqno.fetch_max(max_seqno, Ordering::AcqRel)`.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0edd05bd-9bc3-490f-b0ba-50a1551404e3
📒 Files selected for processing (14)
- Cargo.toml
- benches/block.rs
- benches/bloom.rs
- benches/level_manifest.rs
- benches/memtable.rs
- benches/merge.rs
- benches/partition_point.rs
- benches/tli.rs
- benches/tree.rs
- src/abstract_tree.rs
- src/blob_tree/mod.rs
- src/lib.rs
- src/memtable/mod.rs
- src/tree/mod.rs
💤 Files with no reviewable changes (5)
- Cargo.toml
- benches/partition_point.rs
- benches/tli.rs
- benches/bloom.rs
- benches/block.rs
Codecov Report: ❌ Patch coverage is …

Fix #283
Note: This builds on #288
This is a two-step implementation:
1. A naive loop in `AbstractTree` which calls `memtable::insert()` in a loop.
2. `memtable::insert_batch()`, which loops over the items internally but performs the atomic replace only once at the end.

Results from benchmarks:
Results do not look promising: there is only a very minimal performance improvement with either optimization.
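One plausible reason for the small delta: the batching mainly coalesces atomic counter updates, which are cheap when uncontended, so the per-item insert cost still dominates. A dependency-free sketch of what each step actually changes (function names are illustrative):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Step 1 (naive loop): one atomic read-modify-write on the shared
// size counter per item.
fn naive_loop(approximate_size: &AtomicU64, item_sizes: &[u64]) {
    for size in item_sizes {
        approximate_size.fetch_add(*size, Ordering::AcqRel);
    }
}

// Step 2 (insert_batch): accumulate locally, publish with a single RMW.
fn batched(approximate_size: &AtomicU64, item_sizes: &[u64]) {
    let total: u64 = item_sizes.iter().sum();
    approximate_size.fetch_add(total, Ordering::AcqRel);
}

fn main() {
    let sizes = [10, 20, 30];
    let (a, b) = (AtomicU64::new(0), AtomicU64::new(0));
    naive_loop(&a, &sizes);
    batched(&b, &sizes);
    // Same final state either way; only the number of atomic ops differs.
    println!("{} == {}", a.load(Ordering::Acquire), b.load(Ordering::Acquire));
}
```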
`memtable::insert()` loop: (benchmark screenshot elided)

`memtable::insert_batch()`: (benchmark screenshot elided)
Summary by CodeRabbit
- A new `BatchItem` type, enabling multiple insert and delete operations to be submitted in a single batch call, improving write efficiency.