Draft
8 changes: 5 additions & 3 deletions gix-merge/src/blob/builtin_driver/text/function.rs
@@ -3,9 +3,10 @@ use std::ops::Range;
 use crate::blob::{
     builtin_driver::text::{
         utils::{
-            assure_ends_with_nl, contains_lines, detect_line_ending, detect_line_ending_or_nl, fill_ancestor,
-            hunks_differ_in_diff3, take_intersecting, tokens, write_ancestor, write_conflict_marker, write_hunks,
-            zealously_contract_hunks, CollectHunks, Hunk, Side,
+            assure_ends_with_nl, coalesce_empty_insertions_with_nearest_same_side_hunk, contains_lines,
+            detect_line_ending, detect_line_ending_or_nl, fill_ancestor, hunks_differ_in_diff3, take_intersecting,
+            tokens, write_ancestor, write_conflict_marker, write_hunks, zealously_contract_hunks, CollectHunks, Hunk,
+            Side,
         },
         Conflict, ConflictStyle, Labels, Options,
     },
@@ -72,6 +73,7 @@ pub fn merge<'a>(
     }
 
     hunks.sort_by(|a, b| a.before.start.cmp(&b.before.start));
+    coalesce_empty_insertions_with_nearest_same_side_hunk(&mut hunks);
     let mut hunks = hunks.into_iter().peekable();
     let mut intersecting = Vec::new();
     let mut ancestor_integrated_until = 0;
54 changes: 54 additions & 0 deletions gix-merge/src/blob/builtin_driver/text/utils.rs
@@ -457,6 +457,60 @@ pub fn take_intersecting(
     Some(())
 }
 
+/// Absorb empty-range insertion hunks into an adjacent same-side non-empty hunk.
+///
+/// Different diff algorithms (Myers vs Histogram) can produce different hunk boundaries
+/// for the same edit. Myers sometimes splits what is logically one change into a deletion
+/// plus a separate empty insertion, with one or more unchanged lines between them. When
+/// the empty insertion lands at a position that the other side also touches, the merge
+/// sees a false conflict.
+///
+/// This function detects the pattern: a non-empty same-side hunk `H` followed (after at
+/// most one unchanged base line) by an empty insertion `I`. It extends `H` to cover the
+/// gap and the insertion point, effectively re-joining the split hunk. This makes the
+/// merge insensitive to the diff algorithm's alignment choices for these cases.
+///
+/// Requires `hunks` to be sorted by `before.start`.
+pub fn coalesce_empty_insertions_with_nearest_same_side_hunk(hunks: &mut Vec<Hunk>) {
+    let mut i = 0;
+    while i < hunks.len() {
+        let hunk = &hunks[i];
+        // Only process empty insertions (before range is empty).
+        if !hunk.before.is_empty() {
+            i += 1;
+            continue;
+        }
+        let ins_pos = hunk.before.start;
+        let ins_side = hunk.side;
+        let ins_after_end = hunk.after.end;
+
+        // Look backwards for the nearest same-side non-empty hunk within a gap of ≤ 1 base line.
+        let mut found = false;
+        for j in (0..i).rev() {
+            let candidate = &hunks[j];
+            if candidate.side != ins_side {
+                continue;
+            }
+            if candidate.before.is_empty() {
+                // Skip other empty insertions from the same side.
+                continue;
+            }
+            let gap = ins_pos.saturating_sub(candidate.before.end);
+            if gap <= 1 {
+                // Extend the candidate to cover the gap and the insertion.
+                hunks[j].before.end = ins_pos;
+                hunks[j].after.end = ins_after_end;
+                hunks.remove(i);
+                found = true;
+            }
+            break;
+        }
+        if !found {
+            i += 1;
+        }
+    }
+}
+
 pub fn tokens(input: &[u8]) -> imara_diff::sources::ByteLines<'_, true> {
     imara_diff::sources::byte_lines_with_terminator(input)
 }
64 changes: 64 additions & 0 deletions gix-merge/tests/merge/blob/false_conflict.rs
@@ -0,0 +1,64 @@
+use gix_merge::blob::{builtin_driver, builtin_driver::text::Conflict, Resolution};
+use imara_diff::intern::InternedInput;
+
+/// Minimal reproduction: Myers produces a false conflict where git merge-file resolves cleanly.
+///
+/// base: alpha_x / (blank) / bravo_x / charlie_x / (blank)
+/// ours: (blank) / (blank) / bravo_x / charlie_x
+/// theirs: alpha_x / (blank) / charlie_x / (blank)
+///
+/// base→ours: alpha_x deleted (replaced by blank), trailing blank removed
+/// base→theirs: bravo_x deleted
+///
+/// These are non-overlapping changes that git merges cleanly.
+/// See https://github.com/GitoxideLabs/gitoxide/issues/2475
+#[test]
+fn myers_false_conflict_with_blank_line_ambiguity() {
+    let base = b"alpha_x\n\nbravo_x\ncharlie_x\n\n";
+    let ours = b"\n\nbravo_x\ncharlie_x\n";
+    let theirs = b"alpha_x\n\ncharlie_x\n\n";
Comment on lines +17 to +19

@slarse Simon Larsén (slarse) Mar 25, 2026

tl;dr: I think this should be fixed on the diff-algo side rather than merge-algo side.

If we just add another blank line to each of these revisions, you still get a conflict.

Suggested change
-    let base = b"alpha_x\n\nbravo_x\ncharlie_x\n\n";
-    let ours = b"\n\nbravo_x\ncharlie_x\n";
-    let theirs = b"alpha_x\n\ncharlie_x\n\n";
+    let base = b"alpha_x\n\n\nbravo_x\ncharlie_x\n\n";
+    let ours = b"\n\n\nbravo_x\ncharlie_x\n";
+    let theirs = b"alpha_x\n\n\ncharlie_x\n\n";

This is down to the naive Myers diff algorithm doing a greedy longest-substring match between revisions, such that the matching intermediate lines get matched. The fix seems to just look "one line back" from the hunk, but that only happens to work because there was precisely one line separating these hunks. My edit here separates the hunks by 2 lines instead, making the conflict reappear.

With plain Myers, the base-ours diff here is this:

-alpha_x

+
 bravo_x
 charlie_x
-

This is natural in Myers as the intermediate matching section is greedily matched. I've validated this output both with an old Myers implementation I wrote myself a few years ago, and with gix. It doesn't matter how many blank lines separate the hunks, they'll be matched all the same, as evidenced by my above tweak to the test case causing it to fail again. So checking just one line back is an edge case fix for an edge case.
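
To make the gap argument concrete, here is a minimal sketch. It is not the gix-merge code: `Hunk`, `coalesce`, and `ours_hunks` are hypothetical simplifications (one side only), and the hunk positions are written out by hand from the Myers diff quoted above rather than computed by a diff. It applies the same "gap of at most one base line" rule and shows that the rule fires with one separating line but not with two.

```rust
use std::ops::Range;

/// Simplified, hypothetical stand-in for a merge hunk (single side only).
#[derive(Debug, Clone, PartialEq)]
struct Hunk {
    before: Range<u32>, // line range in base
    after: Range<u32>,  // line range in the changed side
}

/// Fold an empty insertion into the directly preceding non-empty hunk when at
/// most `max_gap` unchanged base lines separate them (the patch uses 1).
fn coalesce(hunks: &mut Vec<Hunk>, max_gap: u32) {
    let mut i = 1;
    while i < hunks.len() {
        let ins = hunks[i].clone();
        let prev = &mut hunks[i - 1];
        let gap = ins.before.start.saturating_sub(prev.before.end);
        if ins.before.is_empty() && !prev.before.is_empty() && gap <= max_gap {
            prev.before.end = ins.before.start;
            prev.after.end = ins.after.end;
            hunks.remove(i);
        } else {
            i += 1;
        }
    }
}

/// Hand-written base→ours hunks for the test input with `n` matched lines
/// between `alpha_x` and `bravo_x`, following the Myers output shown above.
fn ours_hunks(n: u32) -> Vec<Hunk> {
    vec![
        Hunk { before: 0..1, after: 0..0 },                     // -alpha_x
        Hunk { before: (n + 1)..(n + 1), after: n..(n + 1) },   // + (inserted line)
        Hunk { before: (n + 3)..(n + 4), after: (n + 3)..(n + 3) }, // - (trailing blank)
    ]
}

fn main() {
    let mut one_line = ours_hunks(1);
    coalesce(&mut one_line, 1);
    // Gap of one base line: the insertion is absorbed into the deletion.
    assert_eq!(one_line[0], Hunk { before: 0..2, after: 0..2 });

    let mut two_lines = ours_hunks(2);
    coalesce(&mut two_lines, 1);
    // Gap of two base lines: nothing is absorbed, the split hunks remain.
    assert_eq!(two_lines, ours_hunks(2));
}
```

Raising `max_gap` would only move the threshold; any fixed value can be defeated by adding one more matching line.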

This will always conflict with the removal of bravo_x, which looks like this.

 alpha_x

-bravo_x
 charlie_x

Now, Git's Myers implementation is doing something to prioritize hunk cohesion, because its diff output is different:

-alpha_x
+

 bravo_x
 charlie_x
-

This does not conflict with the removal of bravo_x because there's a buffering matched line between the two hunks. I don't know exactly what optimization Git has here, but the point I'm making is that I think this should be a diff-algorithm fix rather than a merge-algorithm fix.
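
The difference between the two alignments can be sketched as a range check. The `collides` rule below is an assumption made for illustration (an insertion point collides when it touches the other side's base range, and two non-empty ranges collide when they overlap); it is not taken from gix-merge or Git. The base ranges are read off the diffs quoted above.

```rust
use std::ops::Range;

/// Hypothetical collision rule over base line ranges (illustrative only).
fn collides(a: &Range<u32>, b: &Range<u32>) -> bool {
    if a.is_empty() || b.is_empty() {
        // An empty range is an insertion point: it collides when it lies
        // inside or on the edge of the other range.
        a.start <= b.end && b.start <= a.end
    } else {
        // Two non-empty ranges collide only when they share a base line.
        a.start < b.end && b.start < a.end
    }
}

/// True if any hunk of one side collides with any hunk of the other.
fn any_collision(ours: &[Range<u32>], theirs: &[Range<u32>]) -> bool {
    ours.iter().any(|o| theirs.iter().any(|t| collides(o, t)))
}

fn main() {
    // base→theirs: bravo_x (base line 2) removed.
    let theirs: [Range<u32>; 1] = [2..3];

    // base→ours as aligned by the Myers output above:
    // delete base[0], insert at base[2], delete base[4].
    let split_ours: [Range<u32>; 3] = [0..1, 2..2, 4..5];

    // base→ours as aligned by Git: replace base[0], delete base[4].
    let git_ours: [Range<u32>; 2] = [0..1, 4..5];

    assert!(any_collision(&split_ours, &theirs));
    assert!(!any_collision(&git_ours, &theirs));
}
```

Under this rule the only colliding hunk is the empty insertion at base line 2, which is exactly the hunk that Git's alignment does not produce.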

As a final note, I think the test case is a bit misleading as it makes this out to be some issue with blank lines in particular, but it's not. It's an issue with matching lines. Replacing the blank lines with bla has the exact same effect.

Suggested change
-    let base = b"alpha_x\n\nbravo_x\ncharlie_x\n\n";
-    let ours = b"\n\nbravo_x\ncharlie_x\n";
-    let theirs = b"alpha_x\n\ncharlie_x\n\n";
+    let base = b"alpha_x\nbla\nbravo_x\ncharlie_x\n\n";
+    let ours = b"bla\nbla\nbravo_x\ncharlie_x\n";
+    let theirs = b"alpha_x\nbla\ncharlie_x\n\n";

Member

Thanks a lot for looking into this Mattias Granlund (@mtsgrd), and particularly to Simon Larsén (@slarse) for the eager review!

Without having even looked at the details, this makes me think that Git might not run into it because it makes diff-slider adjustments, maybe even to the point where it can avoid conflicts, which would also be rather puzzling to me.

Maybe I'll try porting the diff over to imara-diff v0.2, which ships with 95% Git-diff-slider compatibility from what we could tell.
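
As a rough illustration of what a slider adjustment could do here: an inserted line that is identical to the matched line directly before it can be moved above that line without changing the resulting text. The sketch below is an assumption-laden toy, not Git's heuristic and not imara-diff's API; `Op`, `slide_insertions_up`, `new_side`, and `myers_script` are hypothetical names, and only single-line insertions are handled.

```rust
/// One line of an edit script (illustrative type).
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Keep(&'static str),
    Del(&'static str),
    Ins(&'static str),
}

/// Slide each inserted line upwards past identical kept lines.
fn slide_insertions_up(script: &mut [Op]) {
    for i in 1..script.len() {
        let mut j = i;
        while j > 0 {
            let slidable =
                matches!((&script[j - 1], &script[j]), (Op::Keep(k), Op::Ins(n)) if k == n);
            if !slidable {
                break;
            }
            // Swapping `Keep(x), Ins(x)` into `Ins(x), Keep(x)` keeps the text equal.
            script.swap(j - 1, j);
            j -= 1;
        }
    }
}

/// The text of the changed side that the script produces.
fn new_side(script: &[Op]) -> Vec<&'static str> {
    script
        .iter()
        .filter_map(|op| match op {
            Op::Keep(l) | Op::Ins(l) => Some(*l),
            Op::Del(_) => None,
        })
        .collect()
}

/// Hand-written base→ours script following the Myers output quoted in the
/// review above, with `matched` blank lines between the two hunks.
fn myers_script(matched: usize) -> Vec<Op> {
    let mut s = vec![Op::Del("alpha_x")];
    s.extend(std::iter::repeat(Op::Keep("")).take(matched));
    s.push(Op::Ins(""));
    s.extend([Op::Keep("bravo_x"), Op::Keep("charlie_x"), Op::Del("")]);
    s
}

fn main() {
    for matched in [1, 2] {
        let mut script = myers_script(matched);
        let text_before = new_side(&script);
        slide_insertions_up(&mut script);
        // The insertion now sits directly after the deletion of alpha_x ...
        assert_eq!(script[1], Op::Ins(""));
        // ... and the resulting text is unchanged.
        assert_eq!(new_side(&script), text_before);
    }
}
```

In this toy the result does not depend on how many matching lines separate the two hunks, which is the property the one-line lookback lacks.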


+    let labels = builtin_driver::text::Labels {
+        ancestor: Some("base".into()),
+        current: Some("ours".into()),
+        other: Some("theirs".into()),
+    };
+
+    // Histogram resolves cleanly.
+    {
+        let options = builtin_driver::text::Options {
+            diff_algorithm: imara_diff::Algorithm::Histogram,
+            conflict: Conflict::Keep {
+                style: builtin_driver::text::ConflictStyle::Merge,
+                marker_size: 7.try_into().unwrap(),
+            },
+        };
+        let mut out = Vec::new();
+        let mut input = InternedInput::default();
+        let res = builtin_driver::text(&mut out, &mut input, labels, ours, base, theirs, options);
+        assert_eq!(res, Resolution::Complete, "Histogram should resolve cleanly");
+    }
+
+    // Myers should also resolve cleanly (it used to produce a false conflict because
+    // imara-diff's Myers splits the ours change into two hunks — a deletion at base[0]
+    // and an empty insertion at base[2] — and the insertion collided with theirs'
+    // deletion at base[2]).
+    {
+        let options = builtin_driver::text::Options {
+            diff_algorithm: imara_diff::Algorithm::Myers,
+            conflict: Conflict::Keep {
+                style: builtin_driver::text::ConflictStyle::Merge,
+                marker_size: 7.try_into().unwrap(),
+            },
+        };
+        let mut out = Vec::new();
+        let mut input = InternedInput::default();
+        let res = builtin_driver::text(&mut out, &mut input, labels, ours, base, theirs, options);
+        assert_eq!(
+            res,
+            Resolution::Complete,
+            "Myers should resolve cleanly (git merge-file does). Output:\n{}",
+            String::from_utf8_lossy(&out)
+        );
+    }
+}
1 change: 1 addition & 0 deletions gix-merge/tests/merge/blob/mod.rs
@@ -1,4 +1,5 @@
 mod builtin_driver;
+mod false_conflict;
 mod pipeline;
 mod platform;
