10 changes: 10 additions & 0 deletions Cargo.lock


1 change: 1 addition & 0 deletions Cargo.toml
@@ -229,6 +229,7 @@ members = [
"gix-hash",
"gix-validate",
"gix-ref",
"gix-reftable",
"gix-command",
"gix-config",
"gix-config-value",
104 changes: 104 additions & 0 deletions PLAN.md
@@ -0,0 +1,104 @@
# Reconciled Plan: Reftable Port + Integration

## Branch Reality
As of 2026-03-18, branch `codex/reftable-port-sequence` does not match the original "one commit per step" execution plan.

- The branch contains one reftable-only squash commit: `94793bb6fb` from 2026-03-03.
- That commit sits on top of `e8bf096c07`, which was `main` on 2026-03-03.
- Current `origin/main` is `8e47e0f00b`, so `git diff origin/main..HEAD` mixes this branch's work with unrelated upstream changes.
- To inspect only this branch's payload, compare `HEAD^..HEAD`.

In other words, this branch currently implements the standalone `gix-reftable` port and tests, but it does not yet contain the planned `gix-ref`/`gix` backend integration work.

## Reconciled Scope
Implemented on this branch:
- workspace wiring for `gix-reftable`
- low-level reftable primitives
- record encoding/decoding
- block, blocksource, and single-table reader support
- merged iteration helpers
- writer support
- stack transactions, compaction, reload, and fsck support
- upstream-style `u-reftable-*` parity tests
- selected `t0610`/`t0613`/`t0614` behavior tests
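
As an illustration of the "record encoding/decoding" item, the following is a minimal sketch of the prefix compression that reftable applies to sorted keys: each key is stored as the number of bytes shared with its predecessor plus the remaining suffix. `prefix_compress` and `prefix_expand` are hypothetical names used only here, not functions of `gix-reftable`, and the sketch ignores restart points and the varint framing that real blocks use.

```rust
/// Length of the longest common prefix of `a` and `b`.
fn common_prefix_size(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}

/// Hypothetical helper: turn sorted keys into `(shared_prefix_len, suffix)` pairs.
fn prefix_compress<'a>(sorted_keys: &[&'a [u8]]) -> Vec<(usize, &'a [u8])> {
    let mut previous: &[u8] = b"";
    let mut out = Vec::with_capacity(sorted_keys.len());
    for &key in sorted_keys {
        let shared = common_prefix_size(previous, key);
        out.push((shared, &key[shared..]));
        previous = key;
    }
    out
}

/// Hypothetical inverse of `prefix_compress`: rebuild the full keys.
fn prefix_expand(pairs: &[(usize, &[u8])]) -> Vec<Vec<u8>> {
    let mut previous: Vec<u8> = Vec::new();
    let mut out = Vec::with_capacity(pairs.len());
    for (shared, suffix) in pairs {
        // Reuse the shared bytes of the previous key, then append the suffix.
        let mut key = previous[..*shared].to_vec();
        key.extend_from_slice(suffix);
        out.push(key.clone());
        previous = key;
    }
    out
}

fn main() {
    let keys: [&[u8]; 3] = [b"refs/heads/main", b"refs/heads/next", b"refs/tags/v1"];
    let pairs = prefix_compress(&keys);
    for (shared, suffix) in &pairs {
        println!("{shared} {}", String::from_utf8_lossy(suffix));
    }
    let expanded = prefix_expand(&pairs);
    assert_eq!(expanded, keys.iter().map(|k| k.to_vec()).collect::<Vec<_>>());
}
```

Because ref names share long prefixes such as `refs/heads/`, most entries shrink to a small count and a short suffix, which is why the keys must be sorted before encoding.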

Not implemented on this branch:
- backend-agnostic `gix-ref` store activation
- reftable-backed `gix-ref` adapter
- `gix` repository opening and runtime support for reftable refs
- cross-backend regression coverage for the integrated path
- user-facing documentation of landed support

## Planned Sequence With Current Status
1. **`workspace: add gix-reftable crate skeleton and wire it into Cargo workspace`**
Status: completed, but folded into squash commit `94793bb6fb`.

2. **`gix-reftable: port basics/constants/error/varint primitives from git/reftable`**
Status: completed, but folded into squash commit `94793bb6fb`.

3. **`gix-reftable: implement record model and encode/decode parity (ref/log/obj/index)`**
Status: completed, but folded into squash commit `94793bb6fb`.

4. **`gix-reftable: implement block + blocksource + table reader`**
Status: completed, but folded into squash commit `94793bb6fb`.

5. **`gix-reftable: implement merged table iterators, pq, and tree helpers`**
Status: completed, but folded into squash commit `94793bb6fb`.

6. **`gix-reftable: implement writer with limits/index emission/write options`**
Status: completed, but folded into squash commit `94793bb6fb`.

7. **`gix-reftable: implement stack transactions, auto-compaction, reload, and fsck`**
Status: completed, but folded into squash commit `94793bb6fb`.

8. **`gix-reftable/tests: port upstream u-reftable-* unit suites with 1:1 case mapping`**
Status: completed, but folded into squash commit `94793bb6fb`.

9. **`gix-reftable/tests: add selected t0610/t0613/t0614 behavior parity integration tests`**
Status: completed, but folded into squash commit `94793bb6fb`.

10. **`gix-ref: activate backend-agnostic store abstraction (files + reftable state)`**
Status: not implemented on this branch.

11. **`gix-ref: add reftable-backed store adapter and route find/iter/transaction operations`**
Status: not implemented on this branch.

12. **`gix: switch RefStore to backend-capable store and detect extensions.refStorage=reftable`**
Status: not implemented on this branch.

13. **`gix: make reference iteration/peeling/fetch update paths backend-agnostic`**
Status: not implemented on this branch.

14. **`tests: update reftable open/head expectations and add cross-backend regression coverage`**
Status: not implemented on this branch.

15. **`docs/status: document reftable support, sha256 boundary, and update crate-status`**
Status: not implemented on this branch.
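
Step 12 hinges on reading `extensions.refStorage`. A minimal sketch of that decision is shown here, assuming the configuration value has already been looked up as an optional string. `RefBackend` and `detect_ref_backend` are hypothetical names, not part of `gix` or `gix-ref`, and the choices to match values case-sensitively and to reject unknown values are assumptions of this sketch.

```rust
/// Hypothetical backend selector; not the actual `gix` or `gix-ref` API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RefBackend {
    /// Loose and packed refs in the classic files layout.
    Files,
    /// Refs stored in reftable tables.
    Reftable,
}

/// Map the value of `extensions.refStorage` (if set) to a backend.
///
/// Assumption: an unset value means the files backend, and an unknown value
/// is an error instead of a silent fallback.
fn detect_ref_backend(ref_storage: Option<&str>) -> Result<RefBackend, String> {
    match ref_storage {
        None | Some("files") => Ok(RefBackend::Files),
        Some("reftable") => Ok(RefBackend::Reftable),
        Some(other) => Err(format!("unknown extensions.refStorage value: {other:?}")),
    }
}

fn main() {
    println!("{:?}", detect_ref_backend(None));
    println!("{:?}", detect_ref_backend(Some("reftable")));
    println!("{:?}", detect_ref_backend(Some("sqlite")));
}
```

Failing on unknown values keeps a repository with an unsupported ref format from being opened as if it used the files backend.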

## What Must Happen Next To Match The Original Plan
1. Recreate or rebase this branch on top of current `origin/main`, so that review no longer depends on comparing against the old 2026-03-03 base.
2. Decide whether steps 1 through 9 must be restored as nine reviewable commits or can remain as one squash commit with documented scope.
3. Implement steps 10 through 15 as follow-up commits.
4. Update the existing `gix` reftable-open test once end-to-end support is actually present.

## Validation Guidance
For the work already present here, the relevant validation is:
- `gix-reftable` unit and behavior parity suites
- targeted workspace build/test coverage for the new crate wiring

For the remaining planned work, validation should expand to:
- `gix-ref` targeted tests
- `gix` targeted repository/reference tests
- reftable fixture coverage in repository-open and reference workflows

## Commit Message Rule For Remaining Work
Every remaining commit should still include:
- **Why now**
- **What changed**
- **Why this order**
- **What it unlocks next**

## Assumptions
- Source parity target is Git's in-tree reftable C implementation and tests.
- `gix-reftable` supports SHA-1 and SHA-256 as a standalone crate, independent of any `gix-ref` or `gix` integration.
- End-to-end `gix` reftable support is still outstanding in this branch until steps 10 through 15 land.
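
To make the SHA-1/SHA-256 assumption concrete, here is a small sketch of how a reader could derive the object ID size from a table header. It assumes the layout described in Git's reftable format documentation, where version 1 implies SHA-1 and version 2 carries a four-byte format identifier (`sha1` or `s256`). The function names are hypothetical and do not mirror the `gix-reftable` API.

```rust
/// Size in bytes of the object IDs named by a version-2 format identifier,
/// or `None` if the identifier is unknown.
fn hash_size_for_format_id(format_id: [u8; 4]) -> Option<usize> {
    match &format_id {
        b"sha1" => Some(20),
        b"s256" => Some(32),
        _ => None,
    }
}

/// Hypothetical helper: derive the object ID size from the header version and,
/// for version 2, the format identifier stored in the header.
fn hash_size(version: u8, format_id: Option<[u8; 4]>) -> Option<usize> {
    match (version, format_id) {
        // Version 1 headers have no identifier; SHA-1 is implied.
        (1, _) => Some(20),
        (2, Some(id)) => hash_size_for_format_id(id),
        _ => None,
    }
}

fn main() {
    println!("{:?}", hash_size(1, None));
    println!("{:?}", hash_size(2, Some(*b"s256")));
    println!("{:?}", hash_size(3, None));
}
```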
5 changes: 5 additions & 0 deletions gix-reftable/CHANGELOG.md
@@ -0,0 +1,5 @@
# Changelog

## Unreleased

- Initial crate skeleton.
22 changes: 22 additions & 0 deletions gix-reftable/Cargo.toml
@@ -0,0 +1,22 @@
lints.workspace = true

[package]
name = "gix-reftable"
version = "0.0.0"
repository = "https://github.com/GitoxideLabs/gitoxide"
license = "MIT OR Apache-2.0"
description = "Read and write Git reftable storage"
authors = ["Sebastian Thiel <sebastian.thiel@icloud.com>"]
edition = "2021"
include = ["src/**/*", "LICENSE-*"]
rust-version = "1.82"

[lib]
doctest = false
test = true

[dependencies]
crc32fast = "1.5.0"
flate2 = "1.1.5"
gix-hash = { version = "^0.22.1", path = "../gix-hash", features = ["sha1", "sha256"] }
thiserror = "2.0.18"
1 change: 1 addition & 0 deletions gix-reftable/LICENSE-APACHE
1 change: 1 addition & 0 deletions gix-reftable/LICENSE-MIT
171 changes: 171 additions & 0 deletions gix-reftable/src/basics.rs
@@ -0,0 +1,171 @@
use crate::error::Error;

/// Hash identifiers used by reftable.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Ord, PartialOrd)]
pub enum HashId {
/// SHA-1 object IDs.
Sha1,
/// SHA-256 object IDs.
Sha256,
}

impl HashId {
/// Return the byte-size of object IDs for this hash.
pub const fn size(self) -> usize {
match self {
HashId::Sha1 => 20,
HashId::Sha256 => 32,
}
}

/// Return the corresponding [`gix_hash::Kind`].
pub const fn to_gix(self) -> gix_hash::Kind {
match self {
HashId::Sha1 => gix_hash::Kind::Sha1,
HashId::Sha256 => gix_hash::Kind::Sha256,
}
}
}

/// Return the length in bytes of the longest common prefix of `a` and `b`.
pub fn common_prefix_size(a: &[u8], b: &[u8]) -> usize {
a.iter().zip(b.iter()).take_while(|(a, b)| a == b).count()
}

/// Put a big-endian 64-bit integer into `out`.
pub fn put_be64(out: &mut [u8; 8], value: u64) {
*out = value.to_be_bytes();
}

/// Put a big-endian 32-bit integer into `out`.
pub fn put_be32(out: &mut [u8; 4], value: u32) {
*out = value.to_be_bytes();
}

/// Put a big-endian 24-bit integer into `out`.
///
/// Only the low 24 bits of `value` are written; higher bits are discarded.
pub fn put_be24(out: &mut [u8; 3], value: u32) {
out[0] = ((value >> 16) & 0xff) as u8;
out[1] = ((value >> 8) & 0xff) as u8;
out[2] = (value & 0xff) as u8;
}

/// Put a big-endian 16-bit integer into `out`.
pub fn put_be16(out: &mut [u8; 2], value: u16) {
*out = value.to_be_bytes();
}

/// Read a big-endian 64-bit integer.
pub fn get_be64(input: &[u8; 8]) -> u64 {
u64::from_be_bytes(*input)
}

/// Read a big-endian 32-bit integer.
pub fn get_be32(input: &[u8; 4]) -> u32 {
u32::from_be_bytes(*input)
}

/// Read a big-endian 24-bit integer.
pub fn get_be24(input: &[u8; 3]) -> u32 {
((input[0] as u32) << 16) | ((input[1] as u32) << 8) | (input[2] as u32)
}

/// Read a big-endian 16-bit integer.
pub fn get_be16(input: &[u8; 2]) -> u16 {
u16::from_be_bytes(*input)
}

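/// Encode a reftable varint into `out` and return the number of bytes written.
///
/// The format is the same as the one Git uses for ofs-delta offsets: 7-bit groups
/// are stored most significant first, every byte except the last has the high bit
/// set, and each continuation subtracts one from the remaining value.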
pub fn encode_varint(mut value: u64, out: &mut [u8; 10]) -> usize {
let mut tmp = [0u8; 10];
let mut n = 0usize;
tmp[n] = (value & 0x7f) as u8;
n += 1;
while value >= 0x80 {
value = (value >> 7) - 1;
tmp[n] = 0x80 | (value & 0x7f) as u8;
n += 1;
}
// Groups were produced least significant first; emit them most significant first.
for (dst, src) in out.iter_mut().take(n).zip(tmp[..n].iter().rev()) {
*dst = *src;
}
n
}

/// Decode a reftable varint from `input`.
///
/// Returns `(value, consumed_bytes)`.
pub fn decode_varint(input: &[u8]) -> Result<(u64, usize), Error> {
if input.is_empty() {
return Err(Error::Truncated);
}
let mut i = 0usize;
let mut c = input[i];
i += 1;
let mut value = u64::from(c & 0x7f);
while c & 0x80 != 0 {
if i >= input.len() {
return Err(Error::Truncated);
}
c = input[i];
i += 1;
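// `checked_shl` only validates the shift amount, so reject values whose
// top 7 bits would be shifted out instead of silently dropping them.
value = value.checked_add(1).ok_or(Error::VarintOverflow)?;
if value >> (u64::BITS - 7) != 0 {
return Err(Error::VarintOverflow);
}
value = (value << 7) | u64::from(c & 0x7f);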
}
Ok((value, i))
}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn hash_sizes() {
assert_eq!(HashId::Sha1.size(), 20);
assert_eq!(HashId::Sha256.size(), 32);
}

#[test]
fn common_prefix() {
assert_eq!(common_prefix_size(b"refs/heads/a", b"refs/heads/b"), 11);
assert_eq!(common_prefix_size(b"x", b"y"), 0);
assert_eq!(common_prefix_size(b"", b"abc"), 0);
}

#[test]
fn be_roundtrip() {
let mut be64 = [0u8; 8];
put_be64(&mut be64, 0x0102_0304_0506_0708);
assert_eq!(get_be64(&be64), 0x0102_0304_0506_0708);

let mut be32 = [0u8; 4];
put_be32(&mut be32, 0x0102_0304);
assert_eq!(get_be32(&be32), 0x0102_0304);

let mut be24 = [0u8; 3];
put_be24(&mut be24, 0x01_02_03);
assert_eq!(get_be24(&be24), 0x01_02_03);

let mut be16 = [0u8; 2];
put_be16(&mut be16, 0x0102);
assert_eq!(get_be16(&be16), 0x0102);
}

#[test]
fn varint_roundtrip() {
let mut storage = [0u8; 10];
for value in [0, 1, 2, 126, 127, 128, 129, 16_384, u32::MAX as u64, u64::MAX] {
let n = encode_varint(value, &mut storage);
let (decoded, consumed) = decode_varint(&storage[..n]).expect("valid");
assert_eq!(consumed, n);
assert_eq!(decoded, value);
}
}
}
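
For readers unfamiliar with the offset-style varint used in `basics.rs`, the following standalone sketch shows concrete byte sequences. It re-implements the format with `Vec<u8>` and `Option` for brevity, so the signatures differ from the crate's `encode_varint` and `decode_varint`; treat it as an illustration under those assumptions, not as the crate's API.

```rust
/// Encode `value` as an offset-style varint, most significant group first.
fn encode_varint(mut value: u64) -> Vec<u8> {
    let mut bytes = vec![(value & 0x7f) as u8];
    while value >= 0x80 {
        // Subtracting one removes the redundancy of a zero continuation group.
        value = (value >> 7) - 1;
        bytes.push(0x80 | (value & 0x7f) as u8);
    }
    bytes.reverse();
    bytes
}

/// Decode a varint, returning `(value, consumed)`, or `None` if the input is
/// truncated or the value does not fit into 64 bits.
fn decode_varint(input: &[u8]) -> Option<(u64, usize)> {
    let (&first, rest) = input.split_first()?;
    let mut value = u64::from(first & 0x7f);
    let mut consumed = 1;
    let mut byte = first;
    let mut rest = rest.iter();
    while byte & 0x80 != 0 {
        byte = *rest.next()?;
        consumed += 1;
        value = value.checked_add(1)?;
        // Reject values whose top 7 bits would be lost by the shift.
        if value >> (u64::BITS - 7) != 0 {
            return None;
        }
        value = (value << 7) | u64::from(byte & 0x7f);
    }
    Some((value, consumed))
}

fn main() {
    for value in [127u64, 128, 16_384, u64::MAX] {
        let bytes = encode_varint(value);
        println!("{value} -> {bytes:02x?}");
        assert_eq!(decode_varint(&bytes), Some((value, bytes.len())));
    }
}
```

With this scheme 127 fits into the single byte `7f`, 128 becomes `80 00`, and 16384 becomes `ff 00`, so every value has exactly one encoding.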