Closed
47 commits
dbb4ee4
feat: add optimized contains_prefix() method
polaz Mar 14, 2026
453e729
refactor(contains_prefix): accurate doc wording and test corrections
polaz Mar 14, 2026
c25e693
refactor(blob_tree): accurate contains_prefix override note
polaz Mar 14, 2026
1962eb5
perf: seqno-aware seek in data block point reads
polaz Mar 14, 2026
c52ec80
docs(test): clarify seqno snapshot visibility in test comment
polaz Mar 14, 2026
0513f33
docs(data_block): precise seek_to_key_seqno guarantees
polaz Mar 14, 2026
42d2c64
perf(data_block): single cmp in seek_to_key_seqno predicate
polaz Mar 14, 2026
cbf88d3
docs(test): describe restart_interval loop coverage
polaz Mar 14, 2026
1fddda0
perf(data_block): seqno-aware seek for iterator bounds
polaz Mar 15, 2026
2b0b265
refactor(data_block): dedup seek predicate, harden seqno tests
polaz Mar 15, 2026
95ae8ab
fix(docs): add backticks around identifiers in seek_to_key_seqno doc
polaz Mar 15, 2026
a03b0de
ci: add CoordiNode CI and upstream monitor workflows
polaz Mar 15, 2026
2462f33
docs: add maintained fork notice and support section
polaz Mar 15, 2026
d456379
ci: add dependabot configuration for cargo and actions
polaz Mar 15, 2026
68faa56
ci: add release-plz workflow for automated changelog and releases
polaz Mar 15, 2026
9bf3cf8
ci: split PR checks from full matrix, reduce PR to lint + ubuntu test
polaz Mar 15, 2026
3c7368c
Merge branch 'main' into feat/#138-optimized-containsprefix
polaz Mar 15, 2026
2f119dd
Merge branch 'main' into feat/#237-data-block-seqno-aware-seek
polaz Mar 15, 2026
994436c
fix: resolve all clippy warnings for strict -D warnings CI
polaz Mar 15, 2026
e16fce2
Merge remote-tracking branch 'origin/main' into fix/#2-clippy-warnings
polaz Mar 15, 2026
c21d272
fix(decompress): use runtime validation instead of debug_assert for b…
polaz Mar 15, 2026
cb85fd4
test(block): add corruption test for lz4 byte count validation
polaz Mar 15, 2026
a6a675a
test(vlog): add corruption test for lz4 blob reader byte count valida…
polaz Mar 15, 2026
8f8a154
fix(filter,vlog): guard zero-key division and use checked cast
polaz Mar 15, 2026
5607259
fix(test): use lz4_flex::compress instead of compress_prepend_size
polaz Mar 15, 2026
0376989
docs: add Copilot review instructions with scope and issue-suggestion…
polaz Mar 15, 2026
e967130
Merge remote-tracking branch 'origin/main' into fix/#2-clippy-warnings
polaz Mar 15, 2026
b22f937
ci: add Copilot code review instructions with scope rules
polaz Mar 15, 2026
a677f03
Merge remote-tracking branch 'origin/main' into fix/#2-clippy-warnings
polaz Mar 15, 2026
dbb763a
refactor: upgrade #[allow] to #[expect] with reasons on all suppressions
polaz Mar 15, 2026
5a0575e
docs(table): expand get_highest_seqno docstring, add mixed insert+ing…
polaz Mar 15, 2026
84562fa
Merge remote-tracking branch 'upstream/main'
polaz Mar 15, 2026
3f65399
refactor: compute add_size as usize, remove unreachable wildcard arms
polaz Mar 15, 2026
fc10b94
Merge branch 'main' into fix/#2-clippy-warnings
polaz Mar 15, 2026
364f366
Merge branch 'fix/#2-clippy-warnings' of github.com:structured-world/…
polaz Mar 15, 2026
1a7995a
fix(blob,block): use checked_add for read_len, document size cap scope
polaz Mar 15, 2026
0cee933
Merge pull request #12 from structured-world/fix/#2-clippy-warnings
polaz Mar 15, 2026
d811d02
Merge branch 'main' into docs/#265-seqno-docstring-and-test
polaz Mar 15, 2026
0ea0654
Merge branch 'main' into feat/#237-data-block-seqno-aware-seek
polaz Mar 15, 2026
80283a2
Merge branch 'main' into feat/#138-optimized-containsprefix
polaz Mar 15, 2026
cccff65
Merge pull request #14 from structured-world/docs/#265-seqno-docstrin…
polaz Mar 15, 2026
31fdb57
Merge branch 'main' into feat/#237-data-block-seqno-aware-seek
polaz Mar 15, 2026
b374e6d
Merge branch 'main' into feat/#138-optimized-containsprefix
polaz Mar 15, 2026
4d71fb1
fix: address review feedback on contains_prefix
polaz Mar 15, 2026
2590eb9
docs(data_block): document why reverse seeks accept but ignore seqno
polaz Mar 15, 2026
4a7d0ae
Merge pull request #6 from structured-world/feat/#138-optimized-conta…
polaz Mar 15, 2026
4c40606
Merge remote-tracking branch 'origin/main' into feat/#237-data-block-…
polaz Mar 15, 2026
70 changes: 70 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,70 @@
# GitHub Copilot Instructions for lsm-tree (structured-world fork)

## Project Overview

This is a **maintained fork** of [fjall-rs/lsm-tree](https://github.com/fjall-rs/lsm-tree) — a K.I.S.S. LSM-tree implementation in Rust. We maintain additional features and hardening for the [CoordiNode](https://github.com/structured-world/coordinode) database engine while contributing patches upstream.

## Review Scope Rules (CRITICAL)

**Review ONLY code within the PR's diff.** For issues found in code outside the diff:
- Do NOT suggest inline code fixes for unchanged lines
- Instead, suggest creating a **separate issue** with the finding (e.g., "Consider opening an issue to add size validation in `from_reader` — this is outside the scope of this clippy-fix PR")

**Each PR has a defined scope in its description.** Read the "out of scope" section before reviewing. If something is listed as out of scope, do not flag it — it is tracked in another PR.

**Cross-PR awareness:** This fork has multiple feature branches in parallel. If a hardening or feature seems missing, check whether it exists in another open PR before suggesting it. Reference the other PR number if known.

**Prefer issue suggestions over code suggestions for out-of-scope findings.** This keeps PRs focused and reviewable.

## Rust Code Standards

- **Unsafe code:** Prefer safe alternatives. If `unsafe` is required, it must have a `// SAFETY:` comment explaining the invariant.
- **Error handling:** No `unwrap()` or `expect()` on I/O paths. Use `Result<T, E>` propagation. `expect()` is acceptable for programmer invariants (e.g., lock poisoning) with `#[expect(clippy::expect_used, reason = "...")]`.
- **Clippy:** Code must pass `cargo clippy --all-features -- -D warnings`. Use `#[expect(...)]` (not `#[allow(...)]`) for justified suppressions — `#[expect]` warns if the suppression becomes unnecessary.
- **Casts:** Prefer `TryFrom`/`TryInto` for fallible conversions and `From`/`Into` for lossless widening (e.g., `u32` to `u64`). `as` casts are acceptable only where the value is known to fit, with `#[expect(clippy::cast_possible_truncation, reason = "...")]`.
- **Feature gates:** Code behind `#[cfg(feature = "...")]` must compile with any combination of features. Variables used only in feature-gated branches must also be feature-gated.
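
A minimal sketch of how the error-handling, cast, and suppression rules above could look together. The function names (`read_len_prefix`, `len_to_u32`, `widen`, `lock_counter`) are hypothetical and not part of this crate, and the `#[expect]` on `lock_counter` assumes the `clippy::expect_used` lint is enabled for the project.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;
use std::sync::{Mutex, MutexGuard};

/// Reads a 4-byte little-endian length prefix (hypothetical layout).
/// I/O errors are propagated with `?` instead of `unwrap()`.
pub fn read_len_prefix(path: &Path) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    File::open(path)?.read_exact(&mut buf)?;
    Ok(u32::from_le_bytes(buf))
}

/// Fallible narrowing: `TryFrom` reports an error instead of truncating.
pub fn len_to_u32(len: usize) -> io::Result<u32> {
    u32::try_from(len)
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidData, "length exceeds u32::MAX"))
}

/// Lossless widening: `From` needs no lint suppression at all.
pub fn widen(len: u32) -> u64 {
    u64::from(len)
}

/// `expect()` on a programmer invariant, with a documented suppression.
/// Assumes `clippy::expect_used` is enabled; otherwise the expectation is unfulfilled.
#[expect(clippy::expect_used, reason = "a poisoned lock is a programmer error, not an I/O failure")]
pub fn lock_counter(counter: &Mutex<u64>) -> MutexGuard<'_, u64> {
    counter.lock().expect("counter lock is poisoned")
}
```

Because `#[expect]` warns once the lint stops firing, a suppression like the one on `lock_counter` cannot silently outlive the code it was written for.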

## Testing Standards

- **Corruption tests:** When adding validation for on-disk data, add a test that tampers the relevant field and asserts the error. Use the same serialization path as production (e.g., `lz4_flex::compress` not `compress_prepend_size`).
- **No mocks for storage:** Tests use real on-disk files via `tempfile::tempdir()`.
- **Test naming:** `fn <what>_<condition>_<expected>()` — e.g., `fn lz4_corrupted_header_triggers_decompress_error()`.
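
As an illustration of the corruption-test pattern and naming scheme, here is a dependency-free sketch. The block layout (4-byte length, then payload) and the helpers `encode_block`, `decode_block`, and `roundtrip_through_file` are assumptions made for this example, not the crate's real format or API; `std::env::temp_dir()` stands in for `tempfile::tempdir()` only to keep the sketch self-contained.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical on-disk layout: 4-byte little-endian payload length, then the payload.
pub fn encode_block(payload: &[u8]) -> io::Result<Vec<u8>> {
    let len = u32::try_from(payload.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "payload too large"))?;
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(payload);
    Ok(out)
}

/// Validates the declared length at runtime before handing out the payload.
pub fn decode_block(bytes: &[u8]) -> io::Result<&[u8]> {
    let invalid = |msg: &'static str| io::Error::new(io::ErrorKind::InvalidData, msg);
    let (header, payload) = bytes
        .split_first_chunk::<4>()
        .ok_or_else(|| invalid("truncated header"))?;
    let declared =
        usize::try_from(u32::from_le_bytes(*header)).map_err(|_| invalid("length overflow"))?;
    if declared != payload.len() {
        return Err(invalid("length mismatch"));
    }
    Ok(payload)
}

/// Writes `bytes` to a scratch file and reads them back, so the test touches a real file.
pub fn roundtrip_through_file(name: &str, bytes: &[u8]) -> io::Result<Vec<u8>> {
    let path: PathBuf = std::env::temp_dir().join(format!("{name}-{}", std::process::id()));
    fs::write(&path, bytes)?;
    let read_back = fs::read(&path);
    fs::remove_file(&path)?;
    read_back
}

// Name follows `<what>_<condition>_<expected>`.
#[test]
fn block_tampered_length_triggers_invalid_data() -> io::Result<()> {
    // Same serialization path as "production" (here: `encode_block`).
    let mut bytes = encode_block(b"hello")?;
    bytes[0] ^= 0xFF; // tamper the length field
    let on_disk = roundtrip_through_file("block-tampered-length", &bytes)?;
    let err = decode_block(&on_disk).expect_err("tampered length must be rejected");
    assert_eq!(err.kind(), io::ErrorKind::InvalidData);
    Ok(())
}
```

The test encodes through the same function a writer would use, then flips bits in exactly one field, so a failure points at the validation for that field rather than at the test's own encoding.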

## Commit Message Format

```
<type>(scope): <description>

- Detail 1
- Detail 2
```

Types: `feat`, `fix`, `refactor`, `test`, `docs`, `style`, `chore`, `perf`, `ci`, `build`, `revert`

**Forbidden patterns:** "address review", "fix PR comments", "WIP", "temporary"
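
The format above could be checked mechanically. The following is only a sketch: `check_subject` is a hypothetical helper, it treats the scope as optional (several commits in this repository omit it), and it matches forbidden patterns as case-insensitive substrings, which can produce false positives on words that merely contain "wip".

```rust
/// Commit types accepted by the format above.
const TYPES: [&str; 11] = [
    "feat", "fix", "refactor", "test", "docs", "style", "chore", "perf", "ci", "build", "revert",
];

/// Forbidden patterns, lowercased for case-insensitive substring matching.
const FORBIDDEN: [&str; 4] = ["address review", "fix pr comments", "wip", "temporary"];

/// Checks the subject line of a commit message (hypothetical helper).
pub fn check_subject(subject: &str) -> Result<(), String> {
    let (head, description) = subject
        .split_once(": ")
        .ok_or_else(|| "missing `: ` separator".to_string())?;

    // `head` is either `type` or `type(scope)`.
    let ty = head.split_once('(').map_or(head, |(ty, _)| ty);
    if !TYPES.contains(&ty) {
        return Err(format!("unknown type `{ty}`"));
    }
    if head.contains('(') && !head.ends_with(')') {
        return Err("unclosed scope".to_string());
    }
    if description.trim().is_empty() {
        return Err("empty description".to_string());
    }

    let lower = subject.to_lowercase();
    if let Some(pattern) = FORBIDDEN.iter().find(|p| lower.contains(**p)) {
        return Err(format!("forbidden pattern `{pattern}`"));
    }
    Ok(())
}
```

For example, `check_subject("perf(data_block): single cmp in seek_to_key_seqno predicate")` returns `Ok(())`, while a subject containing "address review" is rejected.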

## Build and Test

```bash
cargo clippy --all-features -- -D warnings # Lint (strict)
cargo test --features lz4 # Tests with lz4
cargo test --all-features # Tests with all features
cargo fmt --all -- --check # Format check
```

## Feature Flags

| Flag | Description |
|------|-------------|
| `lz4` | LZ4 compression (enabled by default in fjall) |
| `zstd` | Zstd compression (PR #1) |
| `bytes_1` | Use `bytes` crate for Slice type |
| `metrics` | Expose prometheus metrics |

## Architecture Notes

- `src/table/block/` — On-disk block format (header + compressed payload)
- `src/vlog/blob_file/` — Value log for large values (separate from LSM blocks)
- `src/compaction/` — Compaction strategies (leveled, FIFO, tiered)
- `src/seqno.rs` — Sequence number generator (MVCC versioning)
- Compression is pluggable via `CompressionType` enum with `#[cfg(feature)]` variants
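
To make the last note concrete, here is a rough sketch of an enum with feature-gated variants. It is not the crate's actual `CompressionType` definition; the variant set and the `name` method are assumptions for illustration.

```rust
/// Sketch of a compression selector whose variants exist only when the
/// matching Cargo feature is enabled (variant set is hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CompressionType {
    None,
    #[cfg(feature = "lz4")]
    Lz4,
    #[cfg(feature = "zstd")]
    Zstd,
}

impl CompressionType {
    /// Every match arm carries the same gate as its variant, so the match
    /// stays exhaustive under any feature combination without a wildcard arm.
    pub fn name(self) -> &'static str {
        match self {
            Self::None => "none",
            #[cfg(feature = "lz4")]
            Self::Lz4 => "lz4",
            #[cfg(feature = "zstd")]
            Self::Zstd => "zstd",
        }
    }
}
```

Gating the arms rather than adding `_ => ...` means the compiler flags any newly added variant that a `match` forgot to handle.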
22 changes: 22 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,22 @@
version: 2
updates:
  - package-ecosystem: "cargo"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10
    groups:
      minor-and-patch:
        patterns:
          - "*"
        update-types:
          - "minor"
          - "patch"
    commit-message:
      prefix: "chore(deps)"
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
    commit-message:
      prefix: "ci(deps)"
27 changes: 27 additions & 0 deletions .github/instructions/code-review.instructions.md
@@ -0,0 +1,27 @@
---
applyTo: "**/*.rs"
---

# Code Review Instructions for lsm-tree (structured-world fork)

## Scope Rules (CRITICAL)

- **Review ONLY code within the PR's diff.** Do not suggest inline fixes for unchanged lines.
- For issues in code **outside the diff**, suggest creating a **separate issue** instead of proposing code changes. Example: "Consider opening an issue to add size validation here — this is outside this PR's scope."
- **Read the PR description carefully.** If the PR body has an "out of scope" section listing items handled by other PRs, do NOT flag those items.
- This fork has **multiple feature branches in parallel**. A hardening that seems missing in one PR may already exist in another open PR. Check the "out of scope" section for cross-references.

## Rust Standards

- `unsafe` blocks require `// SAFETY:` comments explaining the invariant
- Prefer `#[expect(lint)]` over `#[allow(lint)]` — `#[expect]` warns when suppression becomes unnecessary
- Use `TryFrom`/`TryInto` for fallible conversions; `as` casts need `#[expect(clippy::cast_possible_truncation)]` with reason
- No `unwrap()` / `expect()` on I/O paths — use `Result` propagation
- `expect()` is acceptable for programmer invariants (lock poisoning) with `#[expect(clippy::expect_used, reason = "...")]`
- Code must pass `cargo clippy --all-features -- -D warnings`

## Testing

- Corruption/validation tests: tamper the relevant on-disk field and assert the error
- Use same serialization as production (e.g., `lz4_flex::compress` not `compress_prepend_size`)
- Test naming: `fn <what>_<condition>_<expected>()`
92 changes: 92 additions & 0 deletions .github/workflows/coordinode-ci.yml
@@ -0,0 +1,92 @@
name: CoordiNode CI

on:
  push:
    branches:
      - main
      - "feat/#*"
      - "fix/#*"
  pull_request:
    branches:
      - main

env:
  CARGO_TERM_COLOR: always

jobs:
  lint:
    # Fast gate: runs on every push and PR
    timeout-minutes: 10
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: rustfmt, clippy
      - uses: Swatinem/rust-cache@v2
      - name: Format
        run: cargo fmt --all -- --check
      - name: Clippy (strict)
        run: cargo clippy --all-features -- -D warnings

  test:
    # Full matrix: only on push to main/feature branches (not on PRs)
    if: github.event_name == 'push'
    needs: lint
    timeout-minutes: 20
    strategy:
      matrix:
        rust_version: [stable, "1.90.0"]
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v6
      - uses: dtolnay/rust-toolchain@stable
        with:
          toolchain: ${{ matrix.rust_version }}
      - uses: Swatinem/rust-cache@v2
        with:
          prefix-key: ${{ runner.os }}-cargo
      - uses: taiki-e/install-action@nextest
      - name: Run tests
        run: cargo nextest run --all-features
      - name: Run doc tests
        run: cargo test --doc --features lz4

  test-pr:
    # Lightweight: only on PRs (ubuntu stable)
    if: github.event_name == 'pull_request'
    needs: lint
    timeout-minutes: 15
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: dtolnay/rust-toolchain@stable
      - uses: Swatinem/rust-cache@v2
      - uses: taiki-e/install-action@nextest
      - name: Run tests
        run: cargo nextest run --all-features
      - name: Run doc tests
        run: cargo test --doc --features lz4

  codecov:
    if: github.event_name == 'push'
    needs: lint
    timeout-minutes: 20
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: dtolnay/rust-toolchain@nightly
        with:
          components: llvm-tools-preview
      - uses: Swatinem/rust-cache@v2
      - uses: taiki-e/install-action@cargo-llvm-cov
      - uses: taiki-e/install-action@nextest
      - run: cargo llvm-cov --no-report nextest --all-features
      - run: cargo llvm-cov --no-report --doc --features lz4
      - run: cargo llvm-cov report --doctests --lcov --output-path lcov.info
      - uses: codecov/codecov-action@v5
        with:
          files: lcov.info
        env:
          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
31 changes: 31 additions & 0 deletions .github/workflows/coordinode-release.yml
@@ -0,0 +1,31 @@
name: Release

on:
  push:
    branches:
      - main

permissions:
  contents: write
  pull-requests: write

jobs:
  release-plz:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: Install Rust toolchain
        uses: dtolnay/rust-toolchain@stable

      - name: Run release-plz
        uses: release-plz/action@v0.5
        with:
          command: release-pr
          config: .release-plz.toml
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
130 changes: 130 additions & 0 deletions .github/workflows/upstream-monitor.yml
@@ -0,0 +1,130 @@
name: Upstream Monitor

on:
  schedule:
    - cron: "0 8 * * 1,4"
  workflow_dispatch:

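permissions:
  contents: write
  pull-requests: write
  # Needed by the `gh issue create` / `gh issue list` steps; unlisted scopes default to none
  issues: write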

jobs:
  check-upstream:
    timeout-minutes: 10
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: Add upstream remote
        run: git remote add upstream https://github.com/fjall-rs/lsm-tree.git

      - name: Fetch upstream and origin
        run: |
          git fetch upstream main
          git fetch origin main

      - name: Check for new upstream commits
        id: check
        run: |
          BEHIND=$(git rev-list origin/main..upstream/main --count)
          echo "behind=$BEHIND" >> "$GITHUB_OUTPUT"
          echo "Commits behind upstream: $BEHIND"

      - name: Try merge and create PR or issue
        if: steps.check.outputs.behind > 0
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          BEHIND: ${{ steps.check.outputs.behind }}
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"

          SYNC_BRANCH="chore/upstream-sync-$(date +%Y%m%d)"
          git checkout -b "$SYNC_BRANCH" origin/main

          if git merge --no-commit --no-ff upstream/main 2>&1; then
            git commit -m "chore: sync upstream ($BEHIND new commits)"
            git push origin "$SYNC_BRANCH"

            gh pr create \
              --title "chore: sync upstream ($BEHIND new commits)" \
              --body "$(cat <<'EOF'
          ## Upstream Sync

          Automated sync from [fjall-rs/lsm-tree](https://github.com/fjall-rs/lsm-tree) main branch.

          **Commits behind:** ${{ steps.check.outputs.behind }}

          ### Review checklist
          - [ ] Review upstream changes for breaking modifications
          - [ ] Verify our patches still apply cleanly
          - [ ] Run full test suite
          EOF
          )" \
              --base main \
              --head "$SYNC_BRANCH"
          else
            CONFLICTS=$(git diff --name-only --diff-filter=U 2>/dev/null || true)
            git merge --abort

            gh issue create \
              --title "Upstream sync conflict ($BEHIND new commits)" \
              --body "$(cat <<EOF
          ## Upstream Sync Conflict

          Automated merge from [fjall-rs/lsm-tree](https://github.com/fjall-rs/lsm-tree) main branch failed due to conflicts.

          **Commits behind:** $BEHIND

          ### Conflicting files
          \`\`\`
          $CONFLICTS
          \`\`\`

          ### Resolution
          Manual merge required:
          \`\`\`bash
          git remote add upstream https://github.com/fjall-rs/lsm-tree.git
          git fetch upstream main
          git checkout -b chore/upstream-sync main
          git merge upstream/main
          # Resolve conflicts, then push
          \`\`\`
          EOF
          )" \
              --label "upstream-sync"
          fi

      - name: Check for merged feature branches
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          git fetch upstream main

          for ref in $(git for-each-ref --format='%(refname:short)' refs/remotes/origin/feat/#* refs/remotes/origin/fix/#*); do
            BRANCH="${ref#origin/}"
            BRANCH_TIP=$(git rev-parse "$ref")

            if git merge-base --is-ancestor "$BRANCH_TIP" upstream/main 2>/dev/null; then
              echo "Branch '$BRANCH' is fully merged into upstream/main"

              EXISTING=$(gh issue list --search "Upstream merged: $BRANCH" --state open --json number --jq 'length')
              if [ "$EXISTING" = "0" ]; then
                gh issue create \
                  --title "Upstream merged: $BRANCH" \
                  --body "$(cat <<EOF
          Branch \`$BRANCH\` has been fully merged into upstream's main branch.

          This branch can likely be deleted. Please verify and clean up.

          **Branch tip:** \`$BRANCH_TIP\`
          EOF
          )" \
                  --label "upstream-sync"
              fi
            fi
          done
7 changes: 7 additions & 0 deletions .release-plz.toml
@@ -0,0 +1,7 @@
[workspace]
# Don't publish to crates.io — we're a maintained fork using git dependencies
publish = false
# Create GitHub releases with changelog
git_release_enable = true
# Use conventional commits for changelog
changelog_update = true