feat: add intra-L0 compaction for overlapping runs #276

Open
polaz wants to merge 2 commits into fjall-rs:main from structured-world:feat/intra-l0-compaction-clean

@polaz polaz commented Mar 16, 2026

Summary

  • Add intra-L0 compaction that merges overlapping runs within L0 when too many accumulate
  • Reduces read amplification during write bursts by bounding L0 run count
  • Triggered when L0 run count exceeds 1 but table count is still below the L0→L1 threshold

Technical Details

Strategy change (src/compaction/leveled/mod.rs): After trivial-move checks but before scoring, if L0 has multiple runs and the table count is below l0_threshold, emit Choice::Merge with dest_level=0 and canonical_level=0 to merge all L0 runs in place.
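
The trigger condition can be sketched as a standalone predicate. This is a simplified model, not the crate's code: `L0Stats` and `should_merge_intra_l0` are hypothetical names introduced for illustration, and the `busy` flag stands in for the level-busy check that appears in the reviewed diff.

```rust
/// Hypothetical, simplified view of L0 used only for this sketch.
struct L0Stats {
    /// Number of (possibly overlapping) runs currently in L0.
    run_count: usize,
    /// Total number of tables across all L0 runs.
    table_count: usize,
    /// True if another compaction is already working on L0 tables.
    busy: bool,
}

/// Returns true when the condition described in this PR holds:
/// more than one run, table count still below the L0->L1 threshold,
/// and L0 not already being compacted.
fn should_merge_intra_l0(l0: &L0Stats, l0_threshold: u8) -> bool {
    l0.run_count > 1 && l0.table_count < usize::from(l0_threshold) && !l0.busy
}
```

With a threshold of 4, three single-table runs would qualify, while a single run, a busy level, or four tables would not.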

Run ordering fix (src/version/mod.rs): In with_merge, when dest_level == 0 (intra-L0 only), append the merged run to the end instead of inserting at position 0. This ensures concurrently flushed (newer) runs remain at the front and are searched first during point reads. Memtable flushes use with_new_l0_run, not with_merge, so this path is exclusive to intra-L0 compaction.
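
To illustrate why the position of the merged run matters, here is a minimal sketch of a front-to-back point read over L0 runs. It assumes a run can be modelled as a sorted map; `Run`, `run_of`, and `point_read` are hypothetical names for this example and are not the crate's types.

```rust
use std::collections::BTreeMap;

/// A run modelled as a sorted key-value map (an assumption of this sketch).
type Run = BTreeMap<&'static str, &'static str>;

/// Builds a run from a slice of key-value pairs.
fn run_of(entries: &[(&'static str, &'static str)]) -> Run {
    entries.iter().copied().collect()
}

/// Point read: search runs front to back and return the first hit,
/// so whatever sits at the front shadows everything behind it.
fn point_read(runs: &[Run], key: &str) -> Option<&'static str> {
    runs.iter().find_map(|run| run.get(key).copied())
}
```

If a newer flush holding `a = "new"` lands while older runs holding `a = "old"` are being merged, `[newer, merged]` (append) reads `"new"`, whereas `[merged, newer]` (prepend) would read the stale `"old"`.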

Test plan

  • leveled_intra_l0_compaction — verifies 3 overlapping L0 runs merge into 1, data stays in L0, all keys readable
  • leveled_intra_l0_preserves_newer_run_ordering — verifies a post-compaction flush wins over the merged run
  • Full cargo test --all-features green (232 tests pass)

Supersedes #260 (clean rebased branch — sorry about the mess in that one).

Summary by CodeRabbit

Release Notes

  • Refactor

    • Optimized internal table consolidation strategy to reduce compaction overhead and improve storage efficiency
    • Enhanced merge ordering behavior to better optimize read performance during specific data consolidation scenarios
  • Tests

    • Added comprehensive test coverage for table consolidation behavior and merge ordering preservation across multiple data flush and compaction scenarios

When L0 accumulates multiple overlapping runs but the table count is
still below the L0→L1 threshold, merge them into a single run within
L0 instead of waiting for a full leveled compaction cycle. This reduces
read amplification during write bursts.

- Add intra-L0 compaction path in leveled strategy (dest_level=0)
- Fix with_merge to append (not prepend) merged run at L0 so newer
  concurrent flushes remain at the front and are searched first
- Add tests for intra-L0 merge and run ordering correctness
Copilot AI review requested due to automatic review settings March 16, 2026 13:48

coderabbitai Bot commented Mar 16, 2026

No actionable comments were generated in the recent review. 🎉

📥 Commits

Reviewing files that changed from the base of the PR and between aae89a0 and bd7cfaf.

📒 Files selected for processing (3)
  • src/compaction/leveled/mod.rs
  • src/compaction/leveled/test.rs
  • src/version/mod.rs

Walkthrough

The changes implement intra-L0 compaction, where overlapping tables within level 0 are consolidated into a single run under specified conditions. This includes control flow logic to prefer L0-to-L0 merging before other compaction paths, merge placement logic to preserve run ordering, and comprehensive test coverage validating the consolidation and ordering behavior.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Intra-L0 Compaction Strategy: src/compaction/leveled/mod.rs | Adds conditional logic to detect when L0 contains multiple runs but remains below the threshold, triggering an intra-L0 merge path that consolidates all L0 tables into a single run with destination and canonical levels set to 0. |
| Run Placement on Merge: src/version/mod.rs | Modifies Version::with_merge to append merged runs to the end when destination level is 0 (preserving newer run ordering for L0), while maintaining front insertion for non-L0 destinations. |
| Test Coverage: src/compaction/leveled/test.rs | Adds two tests: leveled_intra_l0_compaction validates that overlapping memtable flushes consolidate into a single L0 run, and leveled_intra_l0_preserves_newer_run_ordering verifies run ordering is maintained post-merge. Includes minor import adjustment to expose SeqNo type. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add intra-L0 compaction for overlapping runs' is a clear, specific, and concise summary of the primary change—adding a new intra-L0 compaction mechanism for overlapping runs within level 0. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%. |

Copilot AI left a comment

Pull request overview

This PR introduces an intra-L0 compaction path for the leveled compaction strategy and adjusts L0 run insertion semantics during merges to preserve newer-run-first lookup ordering.

Changes:

  • Add an intra-L0 compaction choice in the leveled strategy to merge multiple L0 runs into a single L0 run when below the L0→L1 threshold.
  • Change Version::with_merge behavior for dest_level == 0 to append (not prepend) the merged run, keeping concurrently flushed newer runs at the front for point reads.
  • Add unit tests covering intra-L0 compaction behavior and basic post-compaction read correctness.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| src/version/mod.rs | Adjusts L0 merge run insertion order to preserve newer L0 runs ahead of merged intra-L0 output. |
| src/compaction/leveled/mod.rs | Adds strategy logic to select intra-L0 compaction when multiple L0 runs exist but table count is below l0_threshold. |
| src/compaction/leveled/test.rs | Adds tests validating intra-L0 compaction consolidates tables/runs and preserves read correctness. |

Comment thread src/compaction/leveled/mod.rs
Comment thread src/compaction/leveled/test.rs
Rewrite leveled_intra_l0_preserves_newer_run_ordering to call
Version::with_merge directly, simulating a concurrent flush during
intra-L0 compaction. The previous test flushed after compact()
completed, so insert(0, run) vs push(run) made no difference—
the newer run was added by with_new_l0_run, not with_merge.

The new test creates 3 L0 runs, then calls with_merge with only
the 2 older runs' IDs in old_ids, verifying the newest
(concurrently flushed) run stays at position 0 in L0.
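
The scenario that the rewritten test exercises can be sketched with a simplified model. This is not the crate's `Version::with_merge`: `Run` here is a hypothetical stand-in that only holds table IDs, and `merge_l0` is an illustrative helper that drops the tables named in `old_ids` and appends the merged run.

```rust
/// Hypothetical stand-in for a run: just a list of table IDs.
#[derive(Debug, Clone, PartialEq)]
struct Run(Vec<u64>);

/// Simplified model of the L0 branch described for `with_merge`:
/// remove every table listed in `old_ids`, discard runs that became
/// empty, then append the merged run at the end of the level.
fn merge_l0(runs: &[Run], old_ids: &[u64], merged: Run) -> Vec<Run> {
    let mut out: Vec<Run> = runs
        .iter()
        .map(|r| Run(r.0.iter().copied().filter(|id| !old_ids.contains(id)).collect()))
        .filter(|r| !r.0.is_empty())
        .collect();
    out.push(merged);
    out
}
```

Starting from three runs ordered newest first, `[Run([3]), Run([2]), Run([1])]`, merging only tables 1 and 2 into a new table 4 leaves `Run([3])` at position 0 and places `Run([4])` behind it.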
polaz added a commit to structured-world/coordinode-lsm-tree that referenced this pull request Mar 16, 2026
…s#276

Cherry-picked from upstream contribution branches:
- bae6679: document multi_get() output contract (same length,
  same order, None for missing)
- bd7cfaf: rewrite intra-L0 ordering test to exercise
  Version::with_merge directly (conflict resolved: took
  upstream's improved test that validates run position)
polaz added a commit to structured-world/coordinode-lsm-tree that referenced this pull request Mar 16, 2026
@polaz polaz requested a review from Copilot March 16, 2026 21:21

Copilot AI left a comment

Pull request overview

This PR adjusts L0 run ordering during intra-L0 merges to preserve newest-first read semantics, and introduces a leveled-strategy option to compact multiple L0 runs into a single L0 run when still below the L0→L1 threshold.

Changes:

  • Update Version::with_merge so intra-L0 merge outputs are appended (not prepended) to keep concurrently flushed newer L0 runs at the front.
  • Add leveled-strategy selection for intra-L0 compaction when L0 has multiple runs but total table count is still below l0_threshold.
  • Add unit tests validating intra-L0 compaction consolidation and the corrected L0 run ordering behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| src/version/mod.rs | Ensures intra-L0 merge results don’t preempt newer L0 runs, preventing stale point reads under concurrent flush scenarios. |
| src/compaction/leveled/mod.rs | Adds intra-L0 compaction choice to consolidate overlapping L0 runs before hitting the L0→L1 threshold. |
| src/compaction/leveled/test.rs | Adds tests for intra-L0 compaction behavior and verifies the newer-run ordering preservation after with_merge. |

Contributor

@marvin-j97 marvin-j97 left a comment

Also, benchmarking required! Intra-L0 is imo not easy to get right and needs at least some empirical proof that it is useful for certain workloads.

Comment on lines +406 to +416
if first_level.run_count() > 1
    && first_level.table_count() < usize::from(self.l0_threshold)
    && !version.level_is_busy(0, state.hidden_set())
{
    return Choice::Merge(CompactionInput {
        table_ids: first_level.list_ids(),
        dest_level: 0,
        canonical_level: 0,
        target_size: self.target_size,
    });
}

This misses the intention of intra-L0:

  1. it's too expensive (it does not consider the size ratio between L0 and L1)
  2. not the reason why Intra-L0 compaction is typically performed - it is done when L1 is blocked
  3. wrong because the table_count is not relevant to us if all tables form a single run (though rare)
  4. we may not want to merge all tables in L0

Comment thread src/version/mod.rs
if level_idx == dest_level {
    if let Some(run) = Run::new(new_tables.to_vec()) {
        runs.insert(0, run);
        if dest_level == 0 {

This is flimsy once intra-L0 compactions are no longer all-or-nothing.
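
One possible reading of this concern, sketched under stated assumptions: if a future intra-L0 compaction merges only some runs from the middle of L0, appending the result would place it behind older runs that were not part of the merge. The model below is hypothetical (`Run`, `point_read`, `merge_range`, `replace_appending`, and `replace_in_place` are illustrative names, with runs as maps from key to a version number, newest run first) and is not taken from the crate.

```rust
use std::collections::BTreeMap;

/// A run modelled as key -> version number (an assumption of this sketch).
type Run = BTreeMap<&'static str, u32>;

/// Point read: the first run (front to back) containing the key wins.
fn point_read(runs: &[Run], key: &str) -> Option<u32> {
    runs.iter().find_map(|r| r.get(key).copied())
}

/// Merges runs[start..end]; runs are ordered newest first, so iterate
/// oldest to newest and let newer entries overwrite older ones.
fn merge_range(runs: &[Run], start: usize, end: usize) -> Run {
    let mut merged = Run::new();
    for run in runs[start..end].iter().rev() {
        for (k, v) in run {
            merged.insert(*k, *v);
        }
    }
    merged
}

/// Partial merge that appends the result at the end of the level.
fn replace_appending(runs: &[Run], start: usize, end: usize) -> Vec<Run> {
    let merged = merge_range(runs, start, end);
    let mut out = runs[..start].to_vec();
    out.extend_from_slice(&runs[end..]);
    out.push(merged);
    out
}

/// Partial merge that puts the result where the merged runs used to be.
fn replace_in_place(runs: &[Run], start: usize, end: usize) -> Vec<Run> {
    let merged = merge_range(runs, start, end);
    let mut out = runs[..start].to_vec();
    out.push(merged);
    out.extend_from_slice(&runs[end..]);
    out
}
```

With four runs where the two middle ones and the oldest one all contain the same key, appending the merged middle runs lets the oldest run shadow them on a point read, while in-place replacement keeps the newer value visible.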
