On Fri, Feb 16, 2018 at 5:08 AM, Sergey Organov <sorganov@xxxxxxxxx> wrote:
> Hi,
>
> By accepting the challenges raised in recent discussion of advanced
> support for history rebasing and editing in Git, I hopefully figured out
> a clean and elegant method of rebasing merges that I think is "The Right
> Way (TM)" to perform this so far troublesome operation. ["(TM)" here has
> second meaning: a "Trivial Merge (TM)", see below.]
>
> Let me begin by outlining the method in git terms, and special thanks
> here must go to "Johannes Sixt" <j6t@xxxxxxxx> for his original bright
> idea to use "cherry-pick -m1" to rebase merge commits.
>
> End of preface -- here we go.
>

I hope to take a more detailed look at this, possibly including some
attempts at re-creating the process by hand to see it in practice.

> Given 2 original branches, b1 and b2, and a merge commit M that joins
> them, suppose we've already rebased b1 to b1', and b2 to b2'. Suppose
> also that B1' and B2' happen to be the tip commits on b1' and b2',
> respectively.
>
> To produce merge commit M' that joins b1' and b2', the following
> operations will suffice:
>
> 1. Checkout b2' and cherry-pick -m2 M, to produce U2' (and new b2').
> 2. Checkout b1' and cherry-pick -m1 M, to produce U1' (and new b1').
> 3. Merge --no-ff new b2' to current new b1', to produce UM'.
> 4. Get rid of U1' and U2' by re-writing parent references of UM' from
>    U1' and U2' to B1' and B2', respectively, to produce M'.
> 5. Mission complete.
>

Seems pretty straightforward: go to each branch and cherry-pick the
merge relative to the corresponding parent, then re-merge everything
and consume the intermediate commits.

> Let's now see why and how the method actually works.
>
> First off, let me introduce you to my new friend, the Trivial Merge, or
> (TM) for short. By definition, (TM) is a merge that introduces
> absolutely no differences to the sides of the merge.
> (I also like to sometimes call him "Angel Merge", both as the most
> beautiful of all merges, and as direct antithesis to "evil merge".)
>
> One very nice thing about (TM) is that to safely rebase it, it suffices
> to merge its (rebased) parents. It is safe in this case, as (TM) itself
> doesn't possess any content changes, and thus none could be missed by
> replacing it with another merge commit.
>
> I bet most of us have never seen (TM) in practice though, so let's see
> how (TM) can help us handle general case of some random merge. What I'm
> going to do is to create a virtual (TM) and see how it goes from there.
>
> Let's start with this history:
>
>     M
>    / \
>   B1  B2
>
> And let's transform it to the following one, contextually equivalent to
> the original, by introducing 2 simple utility commits U1 and U2, and a
> new utility merge commit UM:
>
>      UM
>     /  \
>   U1    U2
>   |     |
>   B1    B2
>
> Here content of any of the created UM, U1, and U2 is the same, and is
> exact copy of original content of M. I.e., provided [A] denotes
> "content of commit A", we have:
>
>   [UM] = [U1] = [U2] = [M]
>
> Stress again how these changes to the history preserve the exact content
> of the original merge ([UM] = [M]), and how U1 and U2 represent content
> changes due to merge on either side[*], and how neither preceding nor
> subsequent commits content would be affected by the change of
> representation.
>
> Now observe that as [U1] = [UM], and [U2] = [UM], the UM happens to be
> exactly our new friend -- the "Trivial Merge (TM)" his true self,
> introducing zero changes to content.
>
> Next we rebase our new representation of the history and we get:
>
>      UM'
>     /  \
>   U1'   U2'
>   |     |
>   B1'   B2'
>
> Here UM' is bare merge of U1' and U2', in exact accordance with the
> method of rebasing a (TM) we've already discussed above, and U1' and U2'
> are rebased versions of U1 and U2, obtained by usual rebasing methods
> for non-merge commits.
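
A minimal sketch of the numbered steps above, replayed by hand in a
scratch repository, might look like the following. It is only an
illustration under stated assumptions, not the proposed implementation:
the branch names (base, b1_new, b2_new), the file names, and the shell
variables holding commit ids are all made up, plain cherry-picks onto a
new base stand in for the rebase of b1 and b2, and `git commit-tree` is
just one possible way to do the parent rewrite of step 4.

```shell
#!/bin/sh
# Sketch only: every branch, file, and variable name here is invented.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.invalid"
git symbolic-ref HEAD refs/heads/base

# add_commit <file> <text>: write <text> to <file> and commit it,
# reusing <text> as the commit message.
add_commit() {
    printf '%s\n' "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

# Original history: B1 and B2 fork from "base"; M merges them and also
# carries a change (m.txt) that exists only in the merge commit itself.
add_commit base.txt base
git checkout -q -b b1 base
add_commit f1.txt B1
B1=$(git rev-parse HEAD)
git checkout -q -b b2 base
add_commit f2.txt B2
B2=$(git rev-parse HEAD)
git checkout -q b1
git merge -q --no-ff --no-commit b2
printf 'merge-only change\n' > m.txt
git add m.txt
git commit -q -m M
M=$(git rev-parse HEAD)

# Stand-in for the rebase: move base forward, then cherry-pick B1 and B2
# onto it to get B1' and B2'.
git checkout -q base
add_commit n.txt "new base"
git checkout -q -b b1_new base
git cherry-pick "$B1" >/dev/null
B1p=$(git rev-parse HEAD)
git checkout -q -b b2_new base
git cherry-pick "$B2" >/dev/null
B2p=$(git rev-parse HEAD)

# Step 1: on b2', replay M relative to its second parent, giving U2'.
git cherry-pick -m 2 "$M" >/dev/null
U2p=$(git rev-parse HEAD)

# Step 2: on b1', replay M relative to its first parent, giving U1'.
git checkout -q b1_new
git cherry-pick -m 1 "$M" >/dev/null
U1p=$(git rev-parse HEAD)

# Step 3: merge the new b2' into the new b1', giving UM'.
git merge -q --no-ff -m "UM'" b2_new
UMp=$(git rev-parse HEAD)

# UM' is a (TM) only if its tree is identical to both parents' trees.
tree=$(git rev-parse "$UMp^{tree}")
if [ "$tree" != "$(git rev-parse "$U1p^{tree}")" ] ||
   [ "$tree" != "$(git rev-parse "$U2p^{tree}")" ]; then
    echo "UM' is not a trivial merge; stopping for inspection" >&2
    exit 1
fi

# Step 4: M' takes the content of UM' with B1' and B2' as its parents.
Mp=$(git commit-tree -p "$B1p" -p "$B2p" -m "M'" "$tree")
git --no-pager log --graph --format=%s "$Mp"
```

Nothing on either branch is edited during the stand-in rebase here, so
UM' stays a (TM) and the check passes. If the rebase had also changed
one of the branches, UM' might no longer be a (TM), which is the case
where stopping for the user would make sense.
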
>
> (Note, however, that at this point UM' is not necessarily a (TM)
> anymore, so in real implementation it may make sense to check if UM' is
> not a (TM) and stop for possible user amendment.)
>

The process might be a bit tricky for a user to follow, especially if
they don't understand how it creates the special U1' and U2' commits.
However, it *is* the cleanest method I've either seen or thought of for
presenting the conflict to the user.

> Finally, to get to our required merge commit M', we get the content of
> UM' and record two actual parents of the merge:
>
>     M'
>    / \
>   B1' B2'
>
> Where [M'] = [UM'].
>
> That's it. Mission complete.
>
> I expect the method to have the following nice features:
>
> - it carefully preserves user changes by rebasing the merge commit
>   itself, in a way that is semantically similar to rebasing simple
>   (non-merge) commits, yet it allows changes made to branches during
>   history editing to propagate over corresponding merge commit that joins
>   the branches, even automatically when the changes don't conflict, as
>   expected.
>

Right.

> - it has provision for detection of even slightest chances of ending up
>   with surprising merge (just check if UM' is still (TM)), so that
>   implementation could stop for user inspection and amendment when
>   appropriate, yet it is capable of handling trivial cases smoothly and
>   automatically.

Nice!

>
> - it never falls back to simple invocation of merge operation on rebased
>   original branches themselves, thus avoiding the problem of lack of
>   knowledge of how the merge at hand has been performed in the first
>   place. It doesn't prevent implementation from letting user to manually
>   perform whatever merge she wishes when suspect result is automatically
>   detected though.
>

Right, since we re-create the intermediate commits U1' and U2' first,
based on the original merge, that's how we manage to maintain the
result of the merge. I like it.

> - it extends trivially to octopus merges.
>
> - it appears shiny to the point that it will likely be able to handle
>   even darkest evil merges nicely, no special treatment required.
>

Yep, and I like that it has a pretty reasonable way of presenting
conflicts for resolution. It may be a bit tricky to explain the use of
the intermediate commits U1' and U2' though.

> Footnote:
>
> [*] We may as well consider the (UM,U1,U2) trio to be semantically split
> representation of git merge commit, where U1 and U2 represent content
> changes to the sides, and UM represents pure history joint. Or, the
> other way around, we may consider git merge commit to be optimized
> representation of this trio. I think this split representation could
> help to simplify reasoning about git merges in general.

Yes, I think this concept is pretty useful. It could also serve as a
way of showing how the merge worked.

Thanks,
Jake

>
> -- Sergey