On Thu, Apr 12, 2018 at 3:02 PM, Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote: > Hi Jake, > > On Thu, 12 Apr 2018, Jacob Keller wrote: > >> On Wed, Apr 11, 2018 at 10:42 PM, Sergey Organov <sorganov@xxxxxxxxx> wrote: >> > >> > Jacob Keller <jacob.keller@xxxxxxxxx> writes: >> >> On Wed, Apr 11, 2018 at 6:13 AM, Sergey Organov <sorganov@xxxxxxxxx> wrote: >> >>> It was rather --recreate-merges just a few weeks ago, and I've seen >> >>> nobody actually commented either in favor or against the >> >>> --rebase-merges. >> >>> >> >>> git rebase --rebase-merges >> >> >> >> I'm going to jump in here and say that *I* prefer --rebase-merges, as >> >> it clearly mentions merge commits (which is the thing that changes). >> > >> > OK, thanks, it's fair and the first argument in favor of >> > --rebase-merges I see. >> >> I'd be ok with "--keep-merges" also. > > My main argument against --keep-merges is that there is no good short > option for it: -k and -m are already taken. And I really like my `git > rebase -kir` now... > Right, that's unfortunate. > A minor argument in favor of `--rebase-merges` vs `--keep-merges` is that > we do not really keep the merge commits, we rewrite them. In the version > as per this here patch series, we really create recursive merges from > scratch. I also don't have a strong opinion in regards to --keep vs --rebase.. > > In the later patch series on which I am working, we use a variation of > Phillip's strategy which can be construed as a generalization of the > cherry-pick to include merges: for a cherry-pick, we perform a 3-way merge > between the commit and HEAD, with the commit's parent commit as merge > base. With Phillip's strategy, we perform a 3-way merge between the merge > commit and HEAD (i.e. the rebased first parent), with the merge commit's > first parent as merge base, followed by a 3-way merge with the rebased > 2nd parent (with the original 2nd parent as merge base), etc > > However. This strategy, while it performed well in my initial tests (and > in Buga's initial tests, too), *does* involve more than one 3-way merge, > and therefore it risks something very, very nasty: *nested* merge > conflicts. Yea, it could. Finding an elegant solution around this would be ideal! (By elegant, I mean a solution which produces merge conflicts that users can resolve relatively easily). > > Now, I did see nested merge conflicts in the past, very rarely, but that > can happen, when two developers criss-cross merge each others' `master` > branch and are really happy to perform invasive changes that our merge > does not deal well with, such as indentation changes. > > When rebasing a merge conflict, however, such nested conflicts can happen > relatively easily. Not rare at all. Right. This would be true regardless of what strategy we choose, I think. > > I found out about this by doing what I keep preaching in this thred: > theory is often very nice *right* until the point where it hits reality, > and then frequently turns really useless, real quickly. Theoretical > musings can therefore be an utter waste of time, unless accompanied by > concrete examples. Agreed. > > To start, I built on the example for an "evil merge" that I gave already > in the very beginning of this insanely chatty thread: if one branch > changes the signature of a function, and a second branch adds a caller to > that function, then by necessity a merge between those two branches has to > change the caller to accommodate the signature change. Otherwise it would > end up in a broken state. > > In my `sequencer-shears` branch at https://github.com/dscho/git, I added > this as a test case, where I start out with a main.c containing a single > function called core(). I then create one branch where this function is > renamed to hi(), and another branch where the function caller() is added > that calls core(). Then I merge both, amending the merge commit so that > caller() now calls hi(). So this is the main.c after merging: > > int hi(void) { > printf("Hello, world!\n"); > } > /* caller */ > void caller(void) { > hi(); > } > > To create the kind of problems that are all too common in my daily work > (seemingly every time some stable patch in Git for Windows gets > upstreamed, it changes into an incompatible version, causing merge > conflicts, and sometimes not only that... but I digress...), I then added > an "upstream" where some maintainer decided that core() is better called > greeting(), and also that a placeholder function for an event loop should > be added. So in upstream, main.c looks like this: > > int greeting(void) { > printf("Hello, world!\n"); > } > /* main event loop */ > void event_loop(void) { > /* TODO: place holder for now */ > } > > Keep in mind: while this is a minimal example of disagreeing changes that > may look unrealistic, in practice this is the exact type of problem I am > dealing with on a daily basis, in Git for Windows as well as in GVFS Git > (which adds a thicket of branches on top of Git for Windows) and with the > MSYS2 runtime (where Git for Windows stacks patches on top of MSYs2, which > in turn maintains their set of patches on top of the Cygwin runtime), and > with BusyBox, and probably other projects I forgot spontaneously. This > makes me convinced that this is the exact type of problem that will > challenge whatever --rebase-merges has to deal with, or better put: what > the user of --rebase-merges will have to deal with. > > (If I got a penny for every merge conflict I resolved, where test cases > were appended to files in t/, I'd probably be rich by now. Likewise, the > `const char *` -> `struct object_oid *` conflicts have gotten to a point > where I can resolve them while chatting to somebody.) > > Now, rebasing the original patches above (renaming core() to hi(), and > adding caller()) will obviously conflict with those upstream patches > (renaming core() to greeting(), and adding event_loop()). That cannot be > avoided. In the example above, I decided to override upstream's decision > by insisting on the name hi(), and resolving the other merge conflict by > adding *both* event_loop() and caller(). > > The big trick, now, is to avoid forcing the user to resolve the same > conflicts *again* when the merge commit is rebased. The better we can help > the user here, the more powerful will this mode be. > > But here, Phillip's strategy (as implemented by yours truly) runs this > problem: > > int hi(void) { > printf("Hello, world!\n"); > } > <<<<<<< intermediate merge > <<<<<<< HEAD > /* main event loop */ > void event_loop(void) { > /* TODO: place holder for now */ > ======= > ======= > } > >>>>>>> <HASH>... merge head #1 > /* caller */ > void caller(void) { > hi(); > >>>>>>> <HASH>... original merge > } > > Now, no matter who I ask, everybody so far agreed with me that this looks > bad. Like, really bad. There are two merge conflicts, obviously, but it is > not even clear which conflict markers belong together! > > It gets a little better when I take a page out of recursive merge's > playbook, which uses different marker sizes for nested merge conflicts > (which I of course implemented and pushed to `sequencer-shears`, currently > still in an unpolished state): > > int hi(void) { > printf("Hello, world!\n"); > } > <<<<<<< intermediate merge > <<<<<<<< HEAD > /* main event loop */ > void event_loop(void) { > /* TODO: place holder for now */ > ======== > ======= > } > >>>>>>> <HASH>... merge head #1 > /* caller */ > void caller(void) { > hi(); > >>>>>>>> <HASH>... original merge > } > > At least now we understand which conflict markers belong together. But I > still needed to inspect the intermediate states to understand what is > going on: > > After the first 3-way merge (the one between the original merge commit and > HEAD), we have the conflict markers around event_loop() and caller(), > because they had been added into the same spot. > > The second 3-way merge would also want to add the event_loop(), but not > caller(), so ideally it should see that event_loop() is already there and > not add any conflict markers. But that is not the case: event_loop() was > added *with conflict markers*. > > So those conflict markers in the first 3-way merge *cause* the conflicts > in the second 3-way merge! > > And indeed, if we merge the other way round (original merge with 2nd > parent, then with 1st parent), the result looks much better: > > int hi(void) { > printf("Hello, world!\n"); > } > /* main event loop */ > void event_loop(void) { > /* TODO: place holder for now */ > } > <<<<<<<< HEAD > ======== > /* caller */ > void caller(void) { > hi(); > } > >>>>>>>> <HASH>... intermediate merge > > So: the order of the 3-way merges does matter. > > I did implement this, too, in the `sequencer_shears` branch: if the first > 3-way merge causes conflicts, attempt the second one, and if that one is > clean, try merging that merge result into HEAD (forgetting about the first > attempted 3-way merge). > > That is still unsatisfying, though, as it is easy to come up with a > main2.c in the above example that requires the *opposite* merge order to > avoid nested conflicts. I agree, this solution won't work reliably because we can show examples which fail the opposite way. > > The only way out I can see is to implement some sort of "W merge" or > "chandelier merge" that can perform an N-way merge between one revision > and N-1 other revisions (each of the N-1 bringing its own merge base). I > call them "W" or "chandelier" because such a merge can be visualized by > the original merge commit being the center of a chandelier, and each arm > representing one of the N-1 merge heads with their own merge bases. > I think this approach sounds reasonable. > Similar to the 3-way merge we have implemented in xdiff/xmerge.c, this > "chandelier merge" would then generate the two diffs between merge base > and both merge heads, except not only one time, but N-1 times. It would > then iterate through all hunks ordered by file name and line range. Any > hunk without conflicting changes would be applied as-is, and the remaining > ones be turned into conflicts (handling those chandelier arms first where > both diffs' hunks look identical). > > Have I missed any simpler alternative? I *think* this would work well if I understand it, but it's difficult to process without examples. > > Ciao, > Johannes