On Wed, Jul 8, 2020 at 5:07 PM Philippe Blain <levraiphilippeblain@xxxxxxxxx> wrote: > > Hello, > > I've been working on a branch for a while. I've been using `git commit --fixup` and `git commit --squash` > when I noticed that I had forgotten to add something to a previous commit. > Today I did `git rebase --autosquash` to clean up my history, and the rebase failed at the > first 'fixup!' commit with a conflict. However, the conflict is not located at the right place > in the code (it's not in the right subroutine!). This is very surprising to me, and I would > like to understand why it happens. > > Steps to reproduce: > > git clone -b branch-to-be-rebased https://github.com/phil-blain/CICE.git cice > cd cice > git rebase -i --autosquash my-first-commit > # save the todo list without modifications > Auto-merging <file> > CONFLICT (content): Merge conflict in <file> > error: could not apply e8bfa55... fixup! <commit message of f4e1ae6> > # the rebase stops at f4e1ae6 > git diff > # tangential question : for some reason the hunk header does not appear here, I don't know why... > git diff -2 # but it appears here > git grep -p -e '<<<<<<< HEAD' -e '>>>>>>> e8bfa55...' # or here > # ok, the conflict appears in subroutine 'picard_solver' > git show REBASE_HEAD -U5 > # but the original "fixup!" commit only modifies the subroutine 'anderson_solver' !! > > I would have expected that the conflict be created around lines 1118-1132 > (line numbers in f4e1ae6), in the 'anderson_solver' subroutine. > > I don't know if this plays a part here, but commit f4e1ae6 (where the rebase stops) > is the commit where the 'anderson_solver' subroutine is added to the code... > > Thanks, > > Philippe. If you take a look where the rebase stops, you see: $ git ls-files -u 100644 ee4377f1ec6836fa05573976a473373906c37d9f 1 cicecore/cicedynB/dynamics/ice_dyn_vp.F90 100644 30c699ac371c2a751052fa98d04317e84a96ec47 2 cicecore/cicedynB/dynamics/ice_dyn_vp.F90 100644 276f224e9048fe0bbd7c25822695049547362c87 3 cicecore/cicedynB/dynamics/ice_dyn_vp.F90 The difference from the merge base to "other" (index 3) is pretty tiny, you just moved one line in the "anderson_solver" subroutine about 10 lines down. The more interesting difference is from the merge base to "ours" (index 2), seen with: $ git diff :1:cicecore/cicedynB/dynamics/ice_dyn_vp.F90 :2:cicecore/cicedynB/dynamics/ice_dyn_vp.F90 If you search for "anderson_solver" in that diff, you see that the reconstructed diff looks like "subroutine anderson_solver" is renamed to "subroutine picard_solver" with similar but slightly different arguments. The diff routine successfully finds lots of lines common between these two subroutines so it seems quite logical as a way of representing the changes, especially since they occur near each other in the order of the file. If you keep looking further in the diff, it says that a few hundred lines later there's another subroutine rename, this time from "subroutine calc_zeta_Pr" to "subroutine anderson_solver". Looking at it this way, it's quite natural to apply the change from the rebased commit to the picard_solver since the other side of history "renamed" anderson_solver to picard_solver. This might not make as much sense to you, since you tend to think in terms of parse trees and "subroutines" are a high level unit. But diffs work in terms of lines and have no knowledge of any kind of file structure or semantics. If you have a string of functions A,B,C and insert a new function Z on one side of history somewhere in a file that happens to look a lot like existing functions (meaning several identical lines between the two), then diffs won't necessarily treat it as a contiguous block of inserted lines but instead compare function Z on one side to function B on the other, then function B on the first side to function C on the second, etc. There are alternative diff algorithms that try to minimize the number of changed lines shown by the diff, such as --histogram and --patience, and they do shrink this particular diff, but they both yield the same treatment of comparing "subroutine anderson_solver" to "subroutine picard_solver" essentially treating it as a rename. To get a merge like you want, you'd need some kind of higher level semantic merge, or at least need functions to not have so many lines in common between them. This kind of problem also causes big issues when reordering functions in a file. line-by-line diffs and the diff3 merge algorithm tend to really struggle with those. (See e.g. https://www.cis.upenn.edu/~bcpierce/papers/diff3-short.pdf if you want to read up more on these.) Hope that helps, Elijah