On Fri, Nov 29, 2019 at 12:24 AM Pavel Roskin <plroskin@xxxxxxxxx> wrote: > > Hi! > > I've discovered an issue with "git rebase" producing a subtly > incorrect file. In fact, that files even compiled but failed in unit > tests! That's so scary that I'm going to stop using "git rebase" for > now. Fortunately, "git rebase --merge" is working correctly, so I'll > use it. Too bad there is no option to use "--merge" by default. Indeed. We really should fix that, if not just make it the default for everyone. > The issue was observed in git 2.23 and reproduced in today's next > branch (2.24.0.449.g4c06f74957) on up-to-date Fedora 31 x86_64. > > I've created a repository that demonstrates the issue: > https://github.com/proski/git-rebase-demo > > The branch names should be self-explanatory. "master" is the base, > "branch1" and "branch2" contain one change each. If "branch1" is > rebased on top of "branch2", the result is incorrect, saved in the > "rebase-bad" branch. If "git rebase -m" is used, the result is > correct, saved in the "merge-good" branch. > > The files in "rebase-bad" and "merge-good" have exactly the same lines > but in a different order. Yet the changes on branch1 and branch2 > affect non-overlapping parts of the file. There should be no doubt how > the merged code should look like. > > I believe the change on branch2 shifts the lines, so that the first > change from branch1 is applies to a place below the intended location, > and then git goes back to an earlier line to apply the next hunk. I > can imagine that it would do the right thing in case of swapped blocks > of code. Yet I have a real life example where it does a very wrong > thing. > > Indeed, "git diff origin/branch2 origin/rebase-bad" and "git diff > origin/branch2 origin/merge-good" both produce diffs of 9957 bytes > long, different only in the order of the hunks. > > Another interesting data point - "git rebase --interactive" is working > correctly. Thanks for the detailed report and simple testcase. Turns out the --interactive isn't so interesting, because a few cycles back we re-implemented the --merge behavior on top of the interactive machinery so the two use the exact same engine. Anyway, I can duplicate the problem and noticed a few interesting things. Since the am-backend for rebase (the default) basically just uses diff and apply, I tried duplicating with just those after looking at things and noticing that it appeared to be applying patch hunks on the wrong lines: $ git switch branch2 $ git reset --hard origin/branch2 HEAD is now at 1331204 Change on branch 2 $ git diff -U3 origin/master origin/branch1 >diff.patch $ git apply diff.patch $ git diff --shortstat origin/merge-good 1 file changed, 43 insertions(+), 43 deletions(-) $ git diff --shortstat origin/rebase-bad So, this reproduces your bad results. Let's repeat with -U4: $ git reset --hard origin/branch2 HEAD is now at 1331204 Change on branch 2 $ git diff -U4 origin/master origin/branch1 >diff.patch $ git apply diff.patch $ git diff --shortstat origin/merge-good 1 file changed, 10 insertions(+), 10 deletions(-) $ git diff --shortstat origin/rebase-bad 1 file changed, 37 insertions(+), 37 deletions(-) That gives us a result that matches neither merge-good nor rebase-bad, but is closer to the good side. Let's try again with -U5: $ git reset --hard origin/branch2 HEAD is now at 1331204 Change on branch 2 $ git diff -U5 origin/master origin/branch1 >diff.patch $ git apply diff.patch $ git diff --shortstat origin/merge-good $ git diff --shortstat origin/rebase-bad 1 file changed, 43 insertions(+), 43 deletions(-) Ahah! With five lines of context, git diff & git apply can produce the correct result. Sadly, I tried to force this with git rebase, but -C5 only affected the apply side and there's no option to pass to rebase to pass through -U5 to the diff logic. Also, although there is a diff.context config option, git-am ignores it (Note that git_am_config() does not directly check that value and it calls git_default_config(), not git_diff_ui_config() or even git_diff_basic_config()). So, it's not possible to force the am-based rebase to get the right answer currently even if you figure out what the problem is. The merge-based rebase, by contrast, essentially benfits from having the entire files of each version accessible so it automatically gets it right. So, to summarize here: * you have a case where the default 3 lines of context mess stuff up; but rebase --merge works great * am doesn't have a -U option, and ignores the diff.context setting, making it impossible to force the am backend to work on your case and also: * rebase doesn't have an option to use the merge/interactive backend by default (nor an --am option to override it) Also, * The performance of the merge/interactive backend is slightly better than the am-backend (https://public-inbox.org/git/CABPp-BF=ev03WgODk6TMQmuNoatg2kiEe5DR__gJ0OTVqHSnfQ@xxxxxxxxxxxxxx/) and will be getting better * The merge/interactive backend supports many more options than the am-backend, though the am one still has a few the merge backend doesn't. Once the ra/rebase-i-more-options topic merges, --whitespace will be the only consequential option that the am-backend supports that the merge/interactive-backend doesn't. (There's also -C, but as noted above, the merge/interactive backend already have access to the full file). Maybe we should just switch the default, for everyone? (And provide an --am option to override it and a config setting to get the old default?)