Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Buga,

thank you for making this a lot more understandable to this thick
developer.

On Tue, 27 Feb 2018, Igor Djordjevic wrote:

> On 27/02/2018 19:55, Igor Djordjevic wrote:
> > 
> > It would be more along the lines of "(1) rebase old merge commit parents, 
> > (2) generate separate diff between old merge commit and each of its 
> > parents, (3) apply each diff to their corresponding newly rebased 
> > parent respectively (as a temporary commit, one per rebased parent), 
> > (4) merge these temporary commits to generate 'rebased' merge commit, 
> > (5) drop temporary commits, recording their parents as parents of 
> > 'rebased' merge commit (instead of dropped temporary commits)".
> > 
> > Implementation wise, steps (2) and (3) could also be done by simply 
> > copying old merge commit _snapshot_ on top of each of its parents as 
> > a temporary, non-merge commit, then rebasing (cherry-picking) these 
> > temporary commits on top of their rebased parent commits to produce 
> > rebased temporary commits (to be merged for generating 'rebased' 
> > merge commit in step (4)).
> 
> For those still tagging along (and still confused), here are some 
> diagrams (following what Sergey originally described). Note that 
> actual implementation might be even simpler, but I believe it`s a bit 
> easier to understand like this, using some "temporary" commits approach.
> 
> Here`s our starting position:
> 
> (0) ---X1---o---o---o---o---o---X2 (master)
>        |\
>        | A1---A2---A3
>        |             \
>        |              M (topic)
>        |             /
>        \-B1---B2---B3
> 
> 
> Now, we want to rebase merge commit M from X1 onto X2. First, rebase
> merge commit parents as usual:
> 
> (1) ---X1---o---o---o---o---o---X2
>        |\                       |\
>        | A1---A2---A3           | A1'--A2'--A3'
>        |             \          |
>        |              M         |
>        |             /          |
>        \-B1---B2---B3           \-B1'--B2'--B3'
> 
> 
> That was commonly understandable part.

Good. Let's assume that I want to do this interactively (because let's
face it, rebase is boring unless we shake up things a little). And let's
assume that A1 is my only change to the README, and that I realized that
it was incorrect and I do not want the world to see it, so I drop A1'.

Let's see how things go from here:

> Now, for "rebasing" the merge commit (keeping possible amendments), we
> do some extra work. First, we make two temporary commits on top of old
> merge parents, by using exact tree (snapshot) of commit M:
> 
> (2) ---X1---o---o---o---o---o---X2
>        |\                       |\
>        | A1---A2---A3---U1      | A1'--A2'--A3'
>        |             \          |
>        |              M         |
>        |             /          |
>        \-B1---B2---B3---U2      \-B1'--B2'--B3'

Okay, everything would still be the same except that I still have dropped
A1'.

> So here, in terms of _snapshots_ (trees, not diffs), U1 = U2 = M.
> 
> Now, we rebase these temporary commits, too:
> 
> (3) ---X1---o---o---o---o---o---X2
>        |\                       |\
>        | A1---A2---A3---U1      | A1'--A2'--A3'--U1'
>        |             \          |
>        |              M         |
>        |             /          |
>        \-B1---B2---B3---U2      \-B1'--B2'--B3'--U2'

I still want to drop A1 in this rebase, so A1' is still missing.

And now it starts to get interesting.

The diff between A3 and U1 does not touch the README, of course, as I said
that only A1 changed the README.  But the diff between B3 and U2 does
change the README, thanks to M containing A1 change.

Therefore, the diff between B3' and U2' will also have this change to the
README. That change that I wanted to drop.

> As a next step, we merge these temporary commits to produce our
> "rebased" merged commit M:
> 
> (4) ---X1---o---o---o---o---o---X2
>        |\                       |\
>        | A1---A2---A3---U1      | A1'--A2'--A3'--U1'
>        |             \          |                  \
>        |              M         |                   M'
>        |             /          |                  /
>        \-B1---B2---B3---U2      \-B1'--B2'--B3'--U2'

And here, thanks to B3'..U2' changing the README, M' will also have that
change that I wanted to see dropped.

Note that A1' is still dropped in my example.

> Finally, we drop temporary commits, and record rebased commits A3' 
> and B3' as our "rebased" merge commit parents instead (merge commit 
> M' keeps its same tree/snapshot state, just gets parents replaced):
> 
> (5) ---X1---o---o---o---o---o---X2
>        |\                       |\
>        | A1---A2---A3---U1      | A1'--A2'--A3'
>        |             \          |             \
>        |              M         |              M'
>        |             /          |             /
>        \-B1---B2---B3---U2      \-B1'--B2'--B3'

Now, thanks to U2' being dropped (and A1' *still* being dropped), the
change in the README that is still in M' is really only in M'. No other
rebased commit has it. That makes it look as if M' introduced this change
in addition to the changes that were merged between the merge parents.

This is what an "evil merge" is: it does more than just combine the
previously diverging branches. It introduces another change that was not
in any non-merge commits before it.

Sometimes, such an "evil merge" is necessary. For example, when master
converted a couple more `hash` references to `oid` including, say, in a
function signature, and the merged branch contains a new caller using that
old function signature: in this case, the merge commit must convert those
`hash` references to `oid` references, or the code won't compile.

In my example, where I dropped A1' specifically so that that embarrasingly
incorrect change to the README would not be seen by the world, though, the
evil merge would be truly evil: it would show said change to the world.
The exact opposite of what I wanted.

> And that`s it, our merge commit M has been "rebased" to M' :)
> 
> (6) ---X1---o---o---o---o---o---X2 (master)
>                                 |\
>                                 | A1'--A2'--A3'
>                                 |             \
>                                 |              M' (topic)
>                                 |             /
>                                 \-B1'--B2'--B3'
> 
> 
> Important thing to note here is that in our step (3) above, still in 
> terms of trees/snapshots (not diffs), U1' could still be equal to 
> U2', produced merge commit M' tree thus being equal to both of them 
> as well (merge commit introducing no changes to either of its 
> parents, originally described by Sergey as "angel merge").
> 
> But it doesn`t have to be so - if any of the rebased commits A1 to A3 
> or B1 to B3 was dropped or modified (or extra commits added, even), 
> that would influence the trees (snapshots) produced after rebasing U1 
> and U2 to U1' and U2', final merge M' reflecting all these changes as 
> well, besides keeping original merge commit M amendments (preserving 
> "evil merge").

Sadly, it would also introduce evil merges in certain circumstances, as I
demonstrated above.

> Well, that`s some theory, now to hopefully confirm/test/polish all 
> this... or trash it, if flawed beyond correction :P

It would have been nice to have such a simple solution ;-)

So the most obvious way to try to fix this design would be to recreate the
original merge first, even with merge conflicts, and then trying to use the
diff between that and the actual original merge commit. In your example,
this would look like that:

---X1---o---o---o---o---o---X2
   |\                       |\
   | A1---A2---A3--         | A1'--A2'--A3'
   |             \ \        |
   |              M M²      |
   |             / /        |
   \-B1---B2---B3--         \-B1'--B2'--B3'

Note that M² would be generated somewhat like this: `git checkout
A3; git merge -s recursive B3; git add -u; git commit`. If there are merge
conflicts, then the result would include the conflict markers.

Now we would generate something similar to those U1/U2 commits: a
single-parent commit R that reflects the diff between M and M²:

---X1---o---o---o---o---o---X2
   |\                       |\
   | A1---A2---A3           | A1'--A2'--A3'
   |             \          |
   |              M---R     |
   |             /          |
   \-B1---B2---B3           \-B1'--B2'--B3'

Note that the tree of R is identical to the tree of M². We can now
proceed to generate the merge between A3' and B3' (possibly with merge
conflicts) and then reverting R on top:

---X1---o---o---o---o---o---X2
   |\                       |\
   | A1---A2---A3           | A1'--A2'--A3'
   |             \          |              \
   |              M---R     |               M'---M³
   |             /          |              /
   \-B1---B2---B3           \-B1'--B2'--B3'

Of course, as before, the idea would be to squash the reverted
changes into the final merge commit.

Now, would this work?

I doubt it, for at least two reasons:

- if there are merge conflicts between A3/B3 and between A3'/B3', those
  merge conflicts will very likely look very different, and the conflicts
  when reverting R will contain those nested conflicts: utterly confusing.
  And those conflicts will look even more confusing if a patch (such as
  A1') was dropped during an interactive rebase.

- One of the promises was that the new way would also handle merge
  strategies other than recursive. What would happen, for example, if M
  was generated using `-s ours` (read: dropping the B* patches' changes)
  and if B1 had been cherry-picked into the history between X1..X2?

  Reverting R would obviously revert those B1 changes, even if B1' would
  obviously not even be part of the rebased history!

Yes, I agree that this `-s ours` example is quite concocted, but the point
of this example is not how plausible it is, but how easy it is to come up
with a scenario where this design to "rebase merge commits" results in
very, very unwanted behavior.

But maybe I missed something obvious, and the design can still be fixed
somehow?

Ciao,
Johannes

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux