Re: Replaying merges

Elijah Newren <newren@xxxxxxxxx> · Fri, 17 May 2024 18:45:32 -0700

Hi Johannes!

On Fri, May 17, 2024 at 5:35 PM Johannes Schindelin
<Johannes.Schindelin@xxxxxx> wrote:
>
> Hi Elijah,
>
> I took the suggestion to heart that you explained a couple of times to me:
> To replay merge commits (including their merge conflict resolutions) by
> using the _remerged_ commit as merge base, the original merge commit as
> merge head, and the newly-created merge (with conflicts and all) as HEAD.
>
> I noodled on this idea a bit until I got it into a usable shape that I
> applied to great effect when working on the recent embargoed releases.
>
> Here it is, the script [*1*] that I used (basically replacing all the
> `merge -C` instances in the rebase script with `replay-merge.sh`):
>
<snip>
> For the most part, this worked beautifully.

Cool to see someone try it out.

> However. The devil lies in the detail.

Yup, but details rather than detail.  ;-)

<snip>
> The biggest complication being the scenario... when a merge
> conflict had been addressed in the original merge commit, but in the
> replayed merge there is no conflict. In such a scenario, this script _will
> create not one, but two merge conflicts, nested ones_!

Only if merge.conflictStyle="diff3"; if merge.conflictStyle="merge",
then there will be no nested conflict (since the nested conflict comes
from the fact that the base version had a conflict itself).
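
To make the difference concrete, here is a minimal Python sketch (not
Git's code; `render_conflict` is a hypothetical helper, and the labels
Git prints after the markers are omitted).  It formats one conflicted
hunk in both styles, assuming the base already contains committed
conflict markers:

```python
def render_conflict(ours, base, theirs, style="merge", size=7):
    """Lay out one conflicted hunk in the given conflict style.

    Hypothetical helper for illustration only; this is not Git's code.
    """
    out = ["<" * size]
    out += ours
    if style == "diff3":
        # diff3 additionally prints the merge base's version of the hunk.
        out.append("|" * size)
        out += base
    out.append("=" * size)
    out += theirs
    out.append(">" * size)
    return out


# Base: the remerged commit, which recorded a conflict as plain text.
base = ["<<<<<<<", "int x = 1;", "=======", "int x = 2;", ">>>>>>>"]
ours = ["int x = 2;"]    # the new merge, which was clean here
theirs = ["int x = 3;"]  # the original merge's manual resolution

merge_style = render_conflict(ours, base, theirs, "merge")
diff3_style = render_conflict(ours, base, theirs, "diff3")

print("\n".join(merge_style))
print()
print("\n".join(diff3_style))
```

With "merge" the base is never printed, so its markers stay hidden;
with "diff3" they show up between `|||||||` and `=======`, inside the
outer conflict.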

This is one of the issues I noted in my write up a couple years ago:
https://github.com/newren/git/blob/replay/replay-design-notes.txt#L315-L316

Further, it can get worse, since in the current code the inner
conflict from the base merge could be an already arbitrarily nested
merge conflict with N levels (due to recursive merging allowing
arbitrary nesting of merge conflicts), giving us an overall nesting of
N+1 merge conflicts rather than just the 2 you assumed.  That's ugly
enough, but we also need to worry about ensuring the conflict markers
from different merges get different conflict marker lengths, which
presents an extra challenge since the outer merge here is not part of
the original recursive merge.
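
One possible rule for the marker lengths, sketched in Python: make the
outer merge use markers one character longer than any marker-like line
already present in its inputs.  This rule is an assumption for
illustration, not what the merge code does today, and
`longest_marker_run` and `outer_marker_size` are hypothetical names.

```python
import re

DEFAULT_MARKER_SIZE = 7  # Git's default conflict marker length

# A line that starts with a run of 7+ identical marker characters,
# followed by a space (label) or the end of the line.
MARKER_RE = re.compile(r"^(<{7,}|={7,}|>{7,}|\|{7,})(?: |$)")


def longest_marker_run(lines):
    """Length of the longest conflict-marker-like run in `lines`, or 0."""
    longest = 0
    for line in lines:
        m = MARKER_RE.match(line)
        if m:
            longest = max(longest, len(m.group(1)))
    return longest


def outer_marker_size(*versions):
    """Pick a marker length that is longer than any marker in the inputs."""
    inner = max((longest_marker_run(v) for v in versions), default=0)
    return max(DEFAULT_MARKER_SIZE, inner + 1)


base = ["<<<<<<< a", "x", "=======", "y", ">>>>>>> b"]
print(outer_marker_size(base, ["x"], ["z"]))
```

Scanning the inputs sidesteps the need to know how deep the original
recursive merge went, at the cost of being fooled by file content that
merely looks like a marker.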

In addition to these challenges, there are some others:
  * What about when the remerged commit and the newly-created merge
have the "same" conflict?  Does it actually look the "same" to the
diff machinery, so that the conflict resolves away to what the
original merge chose?  (Answer: not with a naive merge of these
three commits; we need to do some extra tweaking.  I'm actually
surprised you said this basic idea worked given this particular
problem.)
  * What about conflicts with binary files?  Or non-textual conflicts
of other types like modify/delete or rename/rename?

> I still do think that your idea has merit, but I fear that it won't ever
> be as easy as performing multiple three-way merges in succession.

I totally agree we need to do more than the simple merge of those
three "commits"; I have ideas for this that address some of the
challenges over at
https://github.com/newren/git/blob/replay/replay-design-notes.txt#L264-L341

> To address the observed problem, the code will always have to be aware of
> unresolved conflicts in the provided merge base, so that it can handle
> them appropriately, and not treat them as plain text, so that no nested
> conflicts need to be created.

I agree we need to handle conflicts specially -- not only in the
provided merge base ('R' in my document) but also in the new merge of
the two parents (what you labelled HEAD and I labelled 'N').
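
For reference, the naive version of the idea can be sketched as a
plain three-way merge per path, with R as the base and N as our side;
O below stands for the original merge commit (a label assumed for this
sketch).  This is a toy model over whole-file contents in Python, not
how Git merges, and `naive_replay` is a hypothetical name.

```python
def naive_replay(R, N, O):
    """Three-way merge of whole files: base R, ours N, theirs O.

    Each argument maps path -> content; a missing path means "absent".
    Returns (merged, conflicts).  Toy model, not Git's algorithm.
    """
    merged, conflicts = {}, []
    for path in sorted(set(R) | set(N) | set(O)):
        r, n, o = R.get(path), N.get(path), O.get(path)
        if o == r:
            result = n   # original merge kept the automatic result
        elif n == r:
            result = o   # same automatic result: replay the resolution
        elif n == o:
            result = n   # both sides agree
        else:
            conflicts.append(path)  # all three differ
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


conflicted = "<<<<<<<\nx = 1\n=======\nx = 2\n>>>>>>>\n"
R = {"a.c": conflicted, "b.c": conflicted}
O = {"a.c": "x = 3\n", "b.c": "x = 3\n"}
N = {"a.c": conflicted, "b.c": "x = 2\n"}  # b.c merged cleanly this time

merged, conflicts = naive_replay(R, N, O)
print(merged, conflicts)
```

Here a.c replays the original resolution only because N's conflict is
byte-for-byte identical to R's, while b.c (conflicted in R, clean in N)
falls through to a conflict whose base is itself a conflicted file.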

> Unfortunately, I did not document properly in what precise circumstances
> those nested conflicts were generated (I was kind of busy trying to
> coordinate everything around the security bug-fix releases), but I hope to
> find some time soon to do so, and to turn them into a set of test cases
> that we can play with.

Yeah, we'll also need to add test cases for some of the other issues I
point out in that document.

I'm looking forward to my situation changing soon and hopefully
getting more time to work on things like this...