Jeff King <peff@xxxxxxxx> writes: > From the user's perspective, calling X "rerere" would probably be OK[1]. > But from an implementation perspective (and to keep the existing > plumbing available and unchanged), it probably makes sense to call it > something else, and have it run both rerere and a new plumbing command > to do the merge-fix work (or call it nothing, and assume that users will > either touch the plumbing directly or will use "git merge" to trigger > both). > ... > I think it should be its own plumbing tool that merge calls alongside > rerere. ;) As long as we use the database keyed with <A,B> and take the merge base into account, "git am" and "git cherry-pick" would not be able to use the merge-fix machinery, so in that sense, calling X "rerere" would not be OK, but I agree with your general sentiment about the UI visible to the end users. Just like "rerere" started with a small step to avoid automating things too much and then later became almost invisible for normal cases because we managed to automate it reasonably well and integrate it into mergy operations, we may be able to do the same for merge-fix machinery. My "this belongs to 'merge'" is primarily coming from it---it might be reusable in other mergy operations with some fundamental changes, but I envision that the primarly and only user of that X would initially be 'merge'. > Not having thought too hard about it yet, this containing relationship > seems like the right direction. I guess you'd do the lookup by computing > the merge-base M of <X,Y> (which we already know anyway), walking M..X > and looking for any entries which mention those commits (in either A or > B slots of the entry), and then similarly narrowing it according to > M..Y. Yes, every time I look at the Reintegrate script, I try to think of an efficient implementation but do not find anything better than the left-right walk over X...Y range, so that is my conclusion after having thought about it very hard as well ;-) > What if instead of commit hashes we used patch ids? Now you may be onto something. While we aim at the ultimately automated ideal that would catch the maximal cases, for the earlier 'xyzzy turns into frotz' example, the minimal cue to identify one side of the pair that keys the "change this new instance of xyzzy into frotz, too" merge-fix is a hunk like this: -const char *xyzzy; +const char *frotz; It does not matter what other changes also appear in the same commit, and my original "branch name" is way too broad, and my previous "ideal" which is "a single problematic commit" is still broader than necessary. Well, patch-id identifies the entire change contained in a single commit, so it is still too broad, but if we can narrow it down to a single hunk like that, perhaps we can use it as one side of the key. And the other side of the key is naturally a hunk like this: + printf("%s\n", xyzzy); i.e. a new use of xyzzy appears where it didn't exist before. And when we merge two branches, one of which has one half of the key (i.e. "const char *xyzzy;" turned into "const char *frotz"), and the other has the other half of the key (i.e. "printf xyzzy" is added), then we'd apply a patch that tells us to do this: - printf("%s\n", xyzzy); + printf("%s\n", frotz); i.e. that patch would be the value in the database keyed by the pair of two previous hunks. > I think it's asking a lot for users to handle the textual conflicts and > semantic ones separately. It would be nice if we could tell them apart > automatically (and I think we can based on what isn't part of the > conflict resolution). After thinking about this a bit more, I realized that it is possible to mechanically sift a human generated resolution that has both textual conflict resolution (which will be handled by "rerere") and semantic one (which needs the merge-fix machinery) into two, without requiring the user to make two separate commits. The key trick is that "rerere" is capable of recording and replaying semantic conflict resolution made to a file that happens to have textual conflict just fine. Because rerere database records the preimage (i.e. with conflict markers) and postimage (i.e. the final resolution for the file) as an entire file contents, it can do 3-way merge for parts of the same file that is away from any conflicted region. If you tell contrib/rerere-train.sh to learn from "pu^{/^Merge branch 'bp/fsmonitor' into pu}", you'll see what I mean. $ git show "pu^{/^Merge branch 'bp/fsmonitor' into pu}" compat/bswap.h shows the result of such resolution. Early part of the combined diff shown by the above command comes from textual conflict resolution, but there is a new implementation of inline version of get_be64 that has become necessary but did not exist in either of the branches getting merged, which is shown as an evil merge. So the way to automatically sift an existing merge into textual conflict resolution (i.e. automatable with "rerere") and semantic evil merge-fix is to first try to recreate the merge and enumerate the paths that get conflicts. Then resolve these paths by grabbing the resolved result for these paths that get textual conflicts out of the original merge. That gives us the textual part (we can run "git rerere" at this point to store the resolution away to the rerere database---which is essentially what contrib/rerere-train does). There will be difference between the result of the above and the original merge commit. The paths that are different are all outside these conflicted paths, and they are what we want in the second commit, i.e. the semantic evil merge-fix, which will be the value for the merge-fix database. So that part is easily automatable. It is much harder to come up with a way to index into the merge-fix database. We can "punt" and use the same index scheme as the Reintegrate script (i.e. key it with the name of the branch being merged, which can be read from the original merge) as the initial approximation, and that would already be much better than what Reintegrate script currently does, in that the recording part is much more automated (in the workflow I use the Reintegrate script, the side that records a merge-fix initially is entirely manual).