
Feat: Add batch write implementation #290

Open

kunalmohan wants to merge 9 commits into fjall-rs:main from kunalmohan:feat/283-batch-write

Conversation

@kunalmohan
Contributor

@kunalmohan kunalmohan commented Apr 26, 2026

Fix #283

Note: This builds on #288

This is a two-step implementation:

  • A batch_write in AbstractTree which calls memtable::insert() in a loop.
  • A further optimization attempt (ad76efc266d562a97903d8e7c853701c8bb89d22): a new memtable::insert_batch() loops over the items internally but performs the atomic replace only once, at the end.
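The difference between the two steps can be sketched with std types; a `Mutex<BTreeMap>` stands in for the crate's lock-free skip map, and all names here are illustrative rather than the actual fjall-rs API:

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

// Illustrative stand-in for the memtable: a Mutex<BTreeMap> replaces the
// crate's lock-free skip map; names do not match the actual fjall-rs API.
struct Memtable {
    items: Mutex<BTreeMap<Vec<u8>, Vec<u8>>>,
    approximate_size: AtomicU64,
}

impl Memtable {
    // Step 1 (naive loop): one atomic size update per item.
    fn insert(&self, key: Vec<u8>, value: Vec<u8>) {
        let size = (key.len() + value.len()) as u64;
        self.items.lock().unwrap().insert(key, value);
        self.approximate_size.fetch_add(size, Ordering::AcqRel);
    }

    // Step 2 (insert_batch): loop internally, one atomic update per batch.
    fn insert_batch(&self, batch: Vec<(Vec<u8>, Vec<u8>)>) {
        let mut total = 0u64;
        let mut map = self.items.lock().unwrap();
        for (key, value) in batch {
            total += (key.len() + value.len()) as u64;
            map.insert(key, value);
        }
        drop(map);
        self.approximate_size.fetch_add(total, Ordering::AcqRel);
    }
}

fn main() {
    let mt = Memtable {
        items: Mutex::new(BTreeMap::new()),
        approximate_size: AtomicU64::new(0),
    };
    mt.insert(b"a".to_vec(), b"1".to_vec());
    mt.insert_batch(vec![(b"b".to_vec(), b"2".to_vec()), (b"c".to_vec(), b"3".to_vec())]);
    println!("approximate_size = {}", mt.approximate_size.load(Ordering::Acquire)); // 6
}
```

The batch path trades one `fetch_add` per item for one per batch, which is the entire saving being benchmarked below.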

Results from benchmarks:

Results do not look promising: there is only minimal performance improvement from either optimization.

memtable::insert() loop:
[Screenshot: benchmark results, 2026-04-26 11:21 PM]

memtable::insert_batch():
[Screenshot: benchmark results, 2026-04-26 11:19 PM]

Summary by CodeRabbit

  • New Features
    • Added batch write support to tree operations with a new BatchItem type, enabling multiple insert and delete operations to be submitted in a single batch call, improving write efficiency.

@coderabbitai

coderabbitai Bot commented Apr 26, 2026

Warning

Rate limit exceeded

@kunalmohan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 32 minutes and 32 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 32 minutes and 32 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b0cc8a1a-b003-40ca-b75e-fdfc294e4976

📥 Commits

Reviewing files that changed from the base of the PR and between ad76efc and e320acb.

📒 Files selected for processing (2)
  • benches/tree.rs
  • src/memtable/mod.rs
📝 Walkthrough

The PR introduces a write_batch API to AbstractTree for optimized batch writes that reduce atomic operations, implements internal batch insertion in Memtable, and removes several benchmark files while updating remaining benchmarks to use updated APIs.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Batch Write API: `src/abstract_tree.rs`, `src/lib.rs` | Adds the public `BatchItem` enum (`Insert`, `Remove`, `RemoveWeak` variants) and a `write_batch` trait method to `AbstractTree`; re-exports `BatchItem` from lib. |
| Tree Implementation: `src/tree/mod.rs`, `src/blob_tree/mod.rs` | Implements the `write_batch` method for both `Tree` and `BlobTree`, converting `BatchItem` variants to internal values and delegating to memtable batch insertion. |
| Memtable Batching: `src/memtable/mod.rs` | Adds an `insert_batch` method that accumulates item sizes and performs atomic updates once per batch instead of per item; includes a corresponding test. |
| Benchmark Deletions: `Cargo.toml`, `benches/tli.rs`, `benches/block.rs`, `benches/partition_point.rs` | Removes three benchmark targets and their associated files (TLI, block compression, and partition_point benchmarks). |
| Benchmark Updates: `benches/bloom.rs`, `benches/level_manifest.rs`, `benches/memtable.rs`, `benches/merge.rs`, `benches/tree.rs` | Updates benchmark code to use the new `SequenceNumberCounter`, `SeqNo::MAX` instead of `None`/`Some`, and new cache/blob-tree APIs; adds a new `tree_batch_write` benchmark group; removes the blocked-bloom-filter and `find_segment` benchmarks. |

Sequence Diagram(s)

sequenceDiagram
    actor User as Client/User
    participant AT as AbstractTree<br/>(Tree/BlobTree)
    participant MT as Memtable
    participant SM as SkipMap
    
    User->>AT: write_batch(items, seqno)
    AT->>AT: Convert BatchItem to InternalValue
    AT->>MT: insert_batch(internal_values)
    
    rect rgba(100, 150, 255, 0.5)
    Note over MT: Batch Processing
    MT->>SM: Insert all items
    MT->>MT: Accumulate total_size
    MT->>MT: Atomic update:<br/>approximate_size += total_size
    MT->>MT: Atomic update:<br/>highest_seqno = seqno
    end
    
    MT-->>AT: (inserted_size, memtable_size)
    AT-->>User: (inserted_size, memtable_size)
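The conversion step in the diagram ("Convert BatchItem to InternalValue") can be sketched as follows; the type shapes are simplified stand-ins, not the crate's actual definitions:

```rust
// Simplified stand-ins for the diagram's types; shapes are illustrative,
// not the crate's actual definitions.
type SeqNo = u64;

enum BatchItem {
    Insert(Vec<u8>, Vec<u8>),
    Remove(Vec<u8>),
}

struct InternalValue {
    key: Vec<u8>,
    value: Vec<u8>,
    seqno: SeqNo,
    tombstone: bool,
}

// The conversion step from the diagram: every BatchItem becomes an
// InternalValue stamped with the batch's seqno; removals become tombstones.
fn to_internal(items: Vec<BatchItem>, seqno: SeqNo) -> Vec<InternalValue> {
    items
        .into_iter()
        .map(|item| match item {
            BatchItem::Insert(key, value) => InternalValue { key, value, seqno, tombstone: false },
            BatchItem::Remove(key) => InternalValue { key, value: Vec::new(), seqno, tombstone: true },
        })
        .collect()
}

fn main() {
    let internal = to_internal(
        vec![
            BatchItem::Insert(b"k".to_vec(), b"v".to_vec()),
            BatchItem::Remove(b"x".to_vec()),
        ],
        42,
    );
    println!("{} {}", internal.len(), internal[1].tombstone); // 2 true
}
```

Stamping the whole batch with one seqno is what lets the memtable publish `highest_seqno` once instead of per item.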

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • 3.1.0 #251: Modifies core tree/blob modules and AbstractTree-related APIs and exports, directly related to this PR's public API changes.

Suggested labels

epic

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00%, which is insufficient; the required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Out of Scope Changes check | ❓ Inconclusive | Several benchmark files were removed (tli.rs, partition_point.rs, block.rs) and existing benchmarks modified, but these changes appear to be housekeeping adjustments unrelated to the batch write implementation objective. | Clarify whether benchmark file removals and modifications (beyond tree_batch_write) are intentional cleanup or should be deferred to a separate PR for scope clarity. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Feat: Add batch write implementation' clearly summarizes the main change, adding a batch write feature, which directly aligns with the primary changeset. |
| Linked Issues check | ✅ Passed | The PR implements both required objectives: the AbstractTree::write_batch method [#283] and benchmarks comparing batch writes vs the naive loop approach, with results showing minimal performance improvement [#283]. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benches/tree.rs`:
- Around line 436-459: The benchmarks build the Tree once outside b.iter(), so
iterations accumulate state and measure overwrites/flush behavior; change both
benchmarks (the group.bench_function cases named "naive loop, {batch_size}
items" and "write_batch, {batch_size} items") to create a fresh TempDir/Tree
inside each timed iteration using Bencher::iter_batched or iter_with_setup so
each measured call starts with an empty tree, pass the prepared items (or
iterator of BatchItem::Insert) into the timed closure, and register throughput
via Criterion::Throughput::Elements(batch_size) so results report per-item cost;
locate code around b.iter(), tree, items, and write_batch to implement this
change.
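The suggested fix follows Criterion's `Bencher::iter_batched` pattern: run setup (a fresh TempDir/Tree) outside the timed region on every iteration. A std-only sketch of that setup-per-iteration pattern, with illustrative names (this is not the Criterion API itself):

```rust
use std::time::{Duration, Instant};

// Std-only sketch of Criterion's iter_batched pattern (illustrative, not the
// Criterion API itself): `setup` runs outside the timed region, so every
// measured call starts from fresh state, e.g. a newly created TempDir/Tree.
fn iter_batched<T>(iters: u32, mut setup: impl FnMut() -> T, mut routine: impl FnMut(T)) -> Duration {
    let mut total = Duration::ZERO;
    for _ in 0..iters {
        let input = setup(); // fresh, untimed state per iteration
        let start = Instant::now();
        routine(input); // only the batched write itself is timed
        total += start.elapsed();
    }
    total
}

fn main() {
    let elapsed = iter_batched(
        10,
        || (0..1_000u64).collect::<Vec<_>>(), // setup: a fresh batch of items
        |items| {
            let _checksum: u64 = items.iter().sum(); // stand-in for write_batch
        },
    );
    println!("timed 10 iterations in {:?}", elapsed);
}
```

Without this separation, each iteration overwrites keys left behind by the previous one, so the benchmark partly measures overwrite and flush behavior rather than per-item batch-write cost.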

In `@src/abstract_tree.rs`:
- Around line 20-33: The enum BatchItem currently declares a V: Into<UserValue>
bound which forces callers issuing only removals to provide a concrete V type;
remove the V: Into<UserValue> bound from the enum declaration and instead
constrain conversion where needed — either (a) make the Insert variant accept a
generic V via a dedicated constructor/factory (e.g., impl BatchItem { pub fn
insert<K, V: Into<UserValue>>(k: K, v: V) -> BatchItem<K> { ... } } or similar)
so only Insert requires Into<UserValue>, or (b) drop the bound on BatchItem
entirely and enforce V: Into<UserValue> at the write_batch callsite where
inserts are converted; update usages of BatchItem::Insert and write_batch
accordingly so type inference for Remove/RemoveWeak no longer needs a V type.
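Option (a) can be sketched like this, with `UserValue` as a stand-in alias; the point is that only the `insert` constructor carries the `Into<UserValue>` bound, so removal-only callers need no value type annotation:

```rust
// Sketch of option (a): the enum itself carries no `V: Into<UserValue>` bound;
// only the `insert` constructor does. `UserValue` is a stand-in alias here.
type UserValue = Vec<u8>;

enum BatchItem<K> {
    Insert(K, UserValue),
    Remove(K),
    RemoveWeak(K),
}

impl<K> BatchItem<K> {
    // Only inserts need a value conversion, so Remove/RemoveWeak callers
    // no longer have to name a concrete V type.
    fn insert<V: Into<UserValue>>(key: K, value: V) -> Self {
        BatchItem::Insert(key, value.into())
    }
}

fn main() {
    // No value-type annotation needed for the removal.
    let items: Vec<BatchItem<&str>> = vec![
        BatchItem::insert("a", "1"),
        BatchItem::Remove("b"),
    ];
    println!("{}", items.len()); // 2
}
```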

In `@src/memtable/mod.rs`:
- Around line 180-209: insert_batch currently inserts into self.items before
updating approximate_size and highest_seqno, allowing readers to see items while
size/seqno are still stale; change insert_batch to first iterate items to
compute total_item_size and the max seqno (without inserting), then call
self.approximate_size.fetch_add(total_item_size, Ordering::AcqRel) and update
highest_seqno (e.g., fetch_max) before doing the second loop that performs
self.items.insert(...) for each item, mirroring the single-item insert ordering
and using the existing symbols: self.items.insert, self.approximate_size, and
highest_seqno/seqno handling.
- Around line 178-214: batch_seqno in Memtable::insert_batch currently captures
only the first item's SeqNo so highest_seqno may be wrong for mixed-seqno
batches; update the loop that iterates over items to track the maximum seqno
(e.g., maintain a max_seqno Option<SeqNo> or compare each item.key.seqno)
instead of setting batch_seqno only once, then after the loop call
self.highest_seqno.fetch_max(max_seqno, Ordering::AcqRel) if present; refer to
functions/fields insert_batch, items, batch_seqno, highest_seqno and SeqNo to
locate and change the logic.
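Both memtable comments above can be addressed together. A sketch with std stand-ins (a `Mutex<BTreeMap>` instead of the lock-free skip map; names are illustrative): pass 1 computes the total size and the batch's maximum seqno and publishes both atomics, pass 2 then makes the items visible.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

// Illustrative sketch: account first (size + max seqno), publish the atomics,
// and only then insert, mirroring the single-item insert ordering.
fn insert_batch(
    items: &Mutex<BTreeMap<Vec<u8>, Vec<u8>>>,
    approximate_size: &AtomicU64,
    highest_seqno: &AtomicU64,
    batch: Vec<(Vec<u8>, Vec<u8>, u64)>, // (key, value, seqno)
) {
    // Pass 1: account without inserting.
    let mut total = 0u64;
    let mut max_seqno = 0u64;
    for (key, value, seqno) in &batch {
        total += (key.len() + value.len()) as u64;
        max_seqno = max_seqno.max(*seqno);
    }
    approximate_size.fetch_add(total, Ordering::AcqRel);
    highest_seqno.fetch_max(max_seqno, Ordering::AcqRel); // max, not first item's seqno
    // Pass 2: make the items visible.
    let mut map = items.lock().unwrap();
    for (key, value, _seqno) in batch {
        map.insert(key, value);
    }
}

fn main() {
    let map = Mutex::new(BTreeMap::new());
    let size = AtomicU64::new(0);
    let seqno = AtomicU64::new(0);
    insert_batch(&map, &size, &seqno, vec![(b"a".to_vec(), b"1".to_vec(), 7)]);
    println!("size={} seqno={}", size.load(Ordering::Acquire), seqno.load(Ordering::Acquire)); // size=2 seqno=7
}
```

Using `fetch_max` rather than a plain store also keeps `highest_seqno` monotonic when batches with mixed or out-of-order seqnos race.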
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0edd05bd-9bc3-490f-b0ba-50a1551404e3

📥 Commits

Reviewing files that changed from the base of the PR and between bdd0544 and ad76efc.

📒 Files selected for processing (14)
  • Cargo.toml
  • benches/block.rs
  • benches/bloom.rs
  • benches/level_manifest.rs
  • benches/memtable.rs
  • benches/merge.rs
  • benches/partition_point.rs
  • benches/tli.rs
  • benches/tree.rs
  • src/abstract_tree.rs
  • src/blob_tree/mod.rs
  • src/lib.rs
  • src/memtable/mod.rs
  • src/tree/mod.rs
💤 Files with no reviewable changes (5)
  • Cargo.toml
  • benches/partition_point.rs
  • benches/tli.rs
  • benches/bloom.rs
  • benches/block.rs

Comment thread benches/tree.rs
Comment thread src/abstract_tree.rs
Comment thread src/memtable/mod.rs Outdated
Comment thread src/memtable/mod.rs Outdated
@codecov

codecov Bot commented Apr 26, 2026

Codecov Report

❌ Patch coverage is 88.52459% with 7 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/memtable/mod.rs | 85.71% | 5 Missing ⚠️ |
| src/blob_tree/mod.rs | 83.33% | 1 Missing ⚠️ |
| src/tree/mod.rs | 91.66% | 1 Missing ⚠️ |


@kunalmohan
Contributor Author

Updated benchmark:
(Still not much improvement)

[Screenshot: benchmark results, 2026-04-26 11:53 PM]


Development

Successfully merging this pull request may close these issues.

Add write_batch method to AbstractTree

1 participant