feat: add intra-L0 compaction for overlapping runs#276
feat: add intra-L0 compaction for overlapping runs#276polaz wants to merge 2 commits intofjall-rs:mainfrom
Conversation
When L0 accumulates multiple overlapping runs but the table count is still below the L0→L1 threshold, merge them into a single run within L0 instead of waiting for a full leveled compaction cycle. This reduces read amplification during write bursts. - Add intra-L0 compaction path in leveled strategy (dest_level=0) - Fix with_merge to append (not prepend) merged run at L0 so newer concurrent flushes remain at the front and are searched first - Add tests for intra-L0 merge and run ordering correctness
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThe changes implement intra-L0 compaction, where overlapping tables within level 0 are consolidated into a single run under specified conditions. This includes control flow logic to prefer L0-to-L0 merging before other compaction paths, merge placement logic to preserve run ordering, and comprehensive test coverage validating the consolidation and ordering behavior. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Pull request overview
This PR introduces an intra-L0 compaction path for the leveled compaction strategy and adjusts L0 run insertion semantics during merges to preserve newer-run-first lookup ordering.
Changes:
- Add an intra-L0 compaction choice in the leveled strategy to merge multiple L0 runs into a single L0 run when below the L0→L1 threshold.
- Change
Version::with_mergebehavior fordest_level == 0to append (not prepend) the merged run, keeping concurrently flushed newer runs at the front for point reads. - Add unit tests covering intra-L0 compaction behavior and basic post-compaction read correctness.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/version/mod.rs |
Adjusts L0 merge run insertion order to preserve newer L0 runs ahead of merged intra-L0 output. |
src/compaction/leveled/mod.rs |
Adds strategy logic to select intra-L0 compaction when multiple L0 runs exist but table count is below l0_threshold. |
src/compaction/leveled/test.rs |
Adds tests validating intra-L0 compaction consolidates tables/runs and preserves read correctness. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
Rewrite leveled_intra_l0_preserves_newer_run_ordering to call Version::with_merge directly, simulating a concurrent flush during intra-L0 compaction. The previous test flushed after compact() completed, so insert(0, run) vs push(run) made no difference— the newer run was added by with_new_l0_run, not with_merge. The new test creates 3 L0 runs, then calls with_merge with only the 2 older runs' IDs in old_ids, verifying the newest (concurrently flushed) run stays at position 0 in L0.
…s#276 Cherry-picked from upstream contribution branches: - bae6679: document multi_get() output contract (same length, same order, None for missing) - bd7cfaf: rewrite intra-L0 ordering test to exercise Version::with_merge directly (conflict resolved: took upstream's improved test that validates run position)
…jall-rs#276" This reverts commit 8a14512.
There was a problem hiding this comment.
Pull request overview
This PR adjusts L0 run ordering during intra-L0 merges to preserve newest-first read semantics, and introduces a leveled-strategy option to compact multiple L0 runs into a single L0 run when still below the L0→L1 threshold.
Changes:
- Update
Version::with_mergeso intra-L0 merge outputs are appended (not prepended) to keep concurrently flushed newer L0 runs at the front. - Add leveled-strategy selection for intra-L0 compaction when L0 has multiple runs but total table count is still below
l0_threshold. - Add unit tests validating intra-L0 compaction consolidation and the corrected L0 run ordering behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
src/version/mod.rs |
Ensures intra-L0 merge results don’t preempt newer L0 runs, preventing stale point reads under concurrent flush scenarios. |
src/compaction/leveled/mod.rs |
Adds intra-L0 compaction choice to consolidate overlapping L0 runs before hitting the L0→L1 threshold. |
src/compaction/leveled/test.rs |
Adds tests for intra-L0 compaction behavior and verifies the newer-run ordering preservation after with_merge. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
marvin-j97
left a comment
There was a problem hiding this comment.
Also, benchmarking required! Intra-L0 is imo not easy to get right and needs at least some empirical proof that it is useful for certain workloads.
| if first_level.run_count() > 1 | ||
| && first_level.table_count() < usize::from(self.l0_threshold) | ||
| && !version.level_is_busy(0, state.hidden_set()) | ||
| { | ||
| return Choice::Merge(CompactionInput { | ||
| table_ids: first_level.list_ids(), | ||
| dest_level: 0, | ||
| canonical_level: 0, | ||
| target_size: self.target_size, | ||
| }); | ||
| } |
There was a problem hiding this comment.
This misses the intention of intra-L0:
- it's too expensive (it does not consider the size ratio between L0 and L1)
- not the reason why Intra-L0 compaction is typically performed - it is done when L1 is blocked
- wrong because the table_count is not relevant too us if all tables form a single run (though rare)
- we may not want to merge all tables in L0
| if level_idx == dest_level { | ||
| if let Some(run) = Run::new(new_tables.to_vec()) { | ||
| runs.insert(0, run); | ||
| if dest_level == 0 { |
There was a problem hiding this comment.
This is flimsy once intra-L0 are not all-or-nothing compactions.
Summary
Technical Details
Strategy change (
src/compaction/leveled/mod.rs): After trivial-move checks but before scoring, if L0 has multiple runs and the table count is belowl0_threshold, emitChoice::Mergewithdest_level=0andcanonical_level=0to merge all L0 runs in place.Run ordering fix (
src/version/mod.rs): Inwith_merge, whendest_level == 0(intra-L0 only), append the merged run to the end instead of inserting at position 0. This ensures concurrently flushed (newer) runs remain at the front and are searched first during point reads. Memtable flushes usewith_new_l0_run, notwith_merge, so this path is exclusive to intra-L0 compaction.Test plan
leveled_intra_l0_compaction— verifies 3 overlapping L0 runs merge into 1, data stays in L0, all keys readableleveled_intra_l0_preserves_newer_run_ordering— verifies a post-compaction flush wins over the merged runcargo test --all-featuresgreen (232 tests pass)Supersedes #260 (clean rebased branch — sorry about the mess in that one).
Summary by CodeRabbit
Release Notes
Refactor
Tests