Skip to content

fix(jsonl-merge): equal-ts entries must converge across machines#1770

Open
jbetala7 wants to merge 1 commit into
garrytan:mainfrom
jbetala7:oss/fix-jsonl-merge-ts-tiebreak
Open

fix(jsonl-merge): equal-ts entries must converge across machines#1770
jbetala7 wants to merge 1 commit into
garrytan:mainfrom
jbetala7:oss/fix-jsonl-merge-ts-tiebreak

Conversation

@jbetala7
Copy link
Copy Markdown
Contributor

Fixes #1769

Repro / observed problem

bin/gstack-jsonl-merge is the git merge driver for append-only JSONL state
files (telemetry, learnings, timeline). Its header promises deterministic
resolution: "both appends survive, ordered by wall-clock timestamp where
available, content hash otherwise."
That promise breaks when two different
entries share the same ts:

a='{"ts":"2026-05-28T10:00:00Z","event":"a"}'
b='{"ts":"2026-05-28T10:00:00Z","event":"b"}'
printf '%s\n' "$a" > ours; printf '%s\n' "$b" > theirs; : > base
bin/gstack-jsonl-merge base ours theirs && cat ours   # a, b
printf '%s\n' "$b" > ours; printf '%s\n' "$a" > theirs; : > base
bin/gstack-jsonl-merge base ours theirs && cat ours   # b, a  <-- diverges

Root cause

The sort key for a timestamped line was (0, ts) with no further tiebreaker.
Python's sort is stable, so equal-ts entries keep insertion order
(base, ours, theirs). Git assigns the local side to %A ("ours"), so the
same logical merge runs with ours/theirs swapped on the two machines, and
the equal-ts lines come out in opposite order. The merged files diverge and
never converge; the next sync is another conflict, resolved differently again.
gstack-telemetry-log stamps second-granularity timestamps
(date -u +%Y-%m-%dT%H:%M:%SZ), so same-ts collisions are routine. Lines
without a ts were already safe — they order by SHA-256 of the content, which
is side-independent.

Fix

Add the line content as the final sort tiebreaker ((0, ts, line), and
(1, h, line) for the hash path for symmetry). Equal-ts entries now order by
content, identically on every machine. One-line behavioral change to the sort
key.

Testing

New test/jsonl-merge.test.ts (6 tests, bun test):

  • equal-ts entries resolve identically with the two sides swapped (the
    convergence regression — fails on main: emits a,b vs b,a; passes
    with the fix)
  • no-ts and plain non-JSON lines also resolve side-independently
  • exact-duplicate dedup preserved
  • timestamped entries still sort ascending by ts
  • absent ours/theirs files tolerated (added-file merge)

Verified the convergence test fails on origin/main's driver and passes with
the change. bash -n clean, git diff --check clean.

The JSONL append merge driver sorted timestamped entries by (0, ts) with no
further tiebreaker. Equal-ts entries then fell back to stable-sort insertion
order (base, ours, theirs), but git assigns the local side to "ours", so two
machines resolving the same conflict emitted equal-ts lines in opposite order.
The merged files diverged and never converged. gstack-telemetry-log uses
second-granularity timestamps, so same-ts collisions are routine.

Add the line content as the final sort tiebreaker so the order is total and
side-independent. Add a regression test that runs the driver with the two
sides swapped and asserts identical output.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

gstack-jsonl-merge: equal-ts entries resolve non-deterministically across machines (append-only logs never converge)

1 participant