Re: [RFC v2] Rebasing merges: a jorney to the ultimate solution (Road Clear)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Johannes,

On 07/03/2018 15:08, Johannes Schindelin wrote:
> 
> > > Didn't we settle on Phillip's "perform successive three-way merges
> > > between the original merge commit and the new tips with the old tips
> > > as base" strategy?
> >
> > It seems you did, dunno exactly why.
> 
> That is not true. You make it sound like I was the only one who liked
> this, and not Phillip and Buga, too.

For myself, I do actually favor Sergey`s approach in general, but 
_implemented_ through what Phillip described (or a mixture of both, to 
be precise). But, let me explain... :)

> > The main problem with this decision is that we still don't see how and
> > when to stop for user amendment using this method. OTOH, the original
> > has this issue carefully discussed.
> 
> Why would we want to stop, unless there are merge conflicts?

Because we can reliably know that something "unusual" happened - and 
by that I don`t necessarily mean "wrong", but just might be worth 
user inspection.

For example, situation like this (M is made on A3 with `-s ours`, 
obsoleting Bx commits):

(1) ---X8--X9 (master)
       |\
       | A1---A2---A3
       |             \
       |              M (topic)
       |             /
       \-B1---B2---B3

... where we want to rebase M onto X9 is what I would call "usual 
stuff", but this situation (M is still made on A3 with `-s ours`, 
obsoleting Bx commits, but note cherry-picked B2'):

(2) ---X8--B2'--X9 (master)
       |\
       | A1---A2---A3
       |             \
       |              M (topic)
       |             /
       \-B1---B2---B3

... where we still want to rebase M onto X9 is what we might consider 
"unusual", because we noticed that something that shouldn`t be part 
of the rebased merge commit (due to previous `-s ours`) actually got 
in there (due to later cherry-pick), and just wanting the user to 
check and confirm.

This is the major reason why I would prefer Sergey`s approach in 
general... and might be I also have a good explanation on *why* it 
works, but let`s get there ;)

(p.s. This is only one, but certainly not the only case)

> > "rebase sides of the merge commit and then three-way merge them back
> > using original merge commit as base"
> 
> And that is also wrong, as I had proved already! Only Buga's addition made
> it robust against dropping/modifying commits, and that addition also makes
> it more complicated.

No, this is actually right, that sentence nicely describing _how_ it 
works. That addition of mine was just including the correct merge 
base (being the original merge commit that we are "rebasing"), and 
that`s what Sergey is talking about.

> And it still has no satisfactory simple explanation why it works.

Getting there... :)

> > > - it is *very easy* to reason about, once it is pointed out that
> > > rebases and merges result in the same trees.
> >
> > The original is as easy to reason about, if not easier, especially as
> > recursive merge strategy is not being used there in new ways.
> 
> So do it. I still have to hear a single-sentence, clear and obvious
> explanation why it works.
> 
> And please do not describe why your original version works, because it
> does not work. Describe why the one amended with Buga's hack works.
> 
> > I honestly don't see any advantages of Phillip's method over the
> > original, except personal preferences. At the same time, I have no
> > objection of using it either, provided consistency check problem is
> > solved there as well.
> 
> Okay, let me reiterate then, because I do not want this point to be
> missed:
> 
> Phillip's method is essentially merging the new tips into the original
> merge, pretending that the new tips were not rebased but merged into
> upstream.
> 
> So it exploits the duality of the rebase and merge operation, which both
> result in identical trees (potentially after resolving merge conflicts).
> 
> I cannot think of any such interpretation for your proposal augmented by
> Buga's fix-ups. And I haven't heard any such interpretation from your
> side, either.

Ok here it goes (I might be still wrong, but please bare with me).

What Sergey originally described also uses the "duality of the rebase 
and merge operation", too ;) Just in a bit different way (and might 
be a more straightforward one?).

Here`s a starting point, two commits A and B, merged into M:

(3) ---A
        \
         M
        /
    ---B


According the "patch theory"[1] (which might not be too popular 
around here, but should serve the purpose for what I`m trying to 
explain), each merge commit can be "transformed" to two non-merge 
commits, one on top of each of the merge parents, where new commit 
brings its original merge parent commit tree to the state of the 
merge commit tree:

(4) ---A---U1



    ---B---U2


Now, we have two new commits, U1 and U2, each having the same tree as 
previous merge commit M, but representing changes in regards to 
specific parents - and this is essentially what Sergey`s original 
approach was using (whether he knew it, or not).

When it comes to rebasing, it`s pretty simple, too. As this:

(5) ---X1---o---o---o---o---o---X2 (master)
       |\
       | A1---A2---A3
       |             \
       |              M
       |             /
       \-B1---B2---B3

... actually equals this:

(6) ---X1---o---o---o---o---o---X2 (master)
       |\
       | A1---A2---A3---U1
       |
       |
       |
       \-B1---B2---B3---U2

... where trees of M, U1 and U2 are same, and we can use the regular 
rebase semantics and rebase it to this:

(7) ---X1---o---o---o---o---o---X2 (master)
                                |\
                                | A1'--A2'--A3'--U1'
                                |
                                |
                                |
                                \-B1'--B2'--B3'--U2'

... which is essentially this again:

(8) ---X1---o---o---o---o---o---X2 (master)
                                |\
                                | A1'--A2'--A3'
                                |            \
                                |             M'
                                |            /
                                \-B1'--B2'--B3'

... where it is still true that trees of U1', U2' and M' are still 
the same. So we managed to rebase a merge commit without ever doing a 
merge :) (note that, practically, we _can_ finally even merge U1' and 
U2' to produce M', it shouldn`t really matter as all the trees are 
the same, so merge will be a no-op)

But, as we saw (what you`ve explained), this doesn`t really work in 
case one of the sides of the merge gets "disbalanced" (so to say), 
like dropping a commit (which could also happen non-interactively, 
where a commit has been cherry-picked to a different branch, but
previously obsoleted by `-s ours` merge).

As observed, dropped commit could still wrongly get into final merge 
commit tree (or cherry-picked commit wrongly not get there), due to 
the nature of those rebased U1 and U2 temporary commits to hold all 
the differences in regards to their specific merge commit parent.

A natural improvement to original idea which Sergey eventually came 
up with after my findings (which you now ended up calling a hack, even, 
but I would respectfully disagree), is to use original merge commit M 
as a merge base for final merge of U1' and U2' - and here is why it 
works, too, and why I wouldn`t consider it a hack, but a proper (and 
a solid) solution.

Merge commit M is what we are rebasing, so we should have a way to 
somehow learn from it, alongside U1 and U2 temporary commits - and 
that is what using original merge commit M as a base does for us. It 
helps us spot any "disbalances" that happened in comparison to original 
merge parent trees, which is exactly what we`re interested in.

In ideal case (as per "patch theory"[1]), final merge of rebased U1' 
and U2' with original merge commit M as a base would be rather simple, 
too (not to say pointless again), as both U1' and U2' would have same 
changes in comparison to M, namely all the changes from commit we are 
rebasing from (X1 above) to commit we are rebasing to (X2 above).

But in a more complex case like this:

(9) ---X1---B2'---o---o---o---o---X2 (master)
                                  |\
                                  | A12--A2'---B3'
                                  |             \
                                  |              M'
                                  |             /
                                  \-B1'--B3'---B4

..., being what we are ultimately interested in, it helps us notice 
all kind of changes on our rebased merge parent branches, and act 
accordingly (as shown by my previously posted test script[2]).

All this said (and hoping anyone is still reading this), to shed some 
light on what I meant by "favoring Sergey`s approach in general, but 
_implemented_ through what Phillip described".

I think we should use temporary commits U1 and U2, being a sound 
approach backed up by the "patch theory"[1], as it helps us notice 
any "disbalances" of the rebased merge commit parents trees (so we 
can stop for user inspection), finally merging them with original 
merge commit as a base, in order to spot additional changes on trees 
of the merge commit parents we are rebasing.

But, the merging steps themselves, backed up by a need to stop for 
conflict resolution as soon as one happens, should be performed 
_iteratively_, like Phillip showed us... I would think :)

And, for example, that would do wonders when we introduce completely 
new branch during an interactive rebase, too, for example:

(10) ---X1---B2'---o---o---o---o---X2 (master)
        |\                        /|\
        | A1---A2---A3---U1       || A12--A2'---B3'---U1'
        |             \           ||             \
        |              M          ||              M'
        |             /           ||             /|
        \-B1---B2---B3---U2       |\---B3'---B4-/-|---U2'
                                  |               |
                                  \-----B1'-------/


In this situation, we would merge all temporary Ux commits using 
original merge M as a merge base, and then merge _that_ resulting 
tree with B1' (being a newly introduced merge commit parent), using 
whatever merge base is most appropriate between the two (in this 
case, X2).

So, for diagram (10) above, I guess something like this:

	git merge-recursive U1' -- M U2'
	tree="$(git write-tree)"
	# in case of original merge being octopus, we would continue like:
	# git merge-recursive $tree -- M U3'
	# tree="$(git write-tree)"
	# git merge-recursive $tree -- M U4'
	# ... and so on, then finally:
	git merge-recursive $tree -- "$(git merge-base U1' U2' B1')" B1'
	# in more general case, it would be:
	# git merge-recursive $tree -- "$(git merge-base <all-parents-of-new-merge-commit>)" B1'
	tree="$(git write-tree)"
	git tag M' "$(git log --pretty=%B -1 M | git commit-tree $tree -p B3' -p B4 -p B1')"

... where we would stop for conflict resolution after each failing 
merge-recursive attempt (which is also true for producing rebased U1' 
and U2' in the first place).

There, I would still need to make a test script doing this, but I 
hope the concept is clear, and I`m not missing something crucial here.

Comments welcome :)

Regards, Buga

[1] https://en.wikibooks.org/wiki/Understanding_Darcs/Patch_theory
[2] https://public-inbox.org/git/f1a960dc-cc5c-e7b0-10b6-39e5516655b3@xxxxxxxxx/



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux