Daniel Barkalow <barkalow@xxxxxxxxxxxx> writes: > Maybe merge_trees should get the two sides as structs with a struct tree * > and a char * branch name, and the struct could someday get optional struct > commit *s for the lineage of the side if somebody comes up with a method > for merging that makes use of the component changes. I do agree with you that with some ancestry-based hinting the "find renames" part of merge_trees() postprocessing can do a better job than the current code. Contrary to a widespread misconception we often hear on this list and on #git, merge-recursive detects renames solely by looking at the three endpoints. Some people incorrectly recommend "commit a pure rename, then commit modifications to the renamed path"; such an artificial split would not at all help merging histories with renames if the modification made after renames are too great. Suppose you have this history, where upper branch renames a path and modifies the contents in the renamed path in commits X, Y and Z. You would want to merge the history leading to B to your HEAD, A, to create a merge M: X---Y---Z---B X---Y---Z---B / ===> / \ ---O---o---o---A ---O---o---o---A---M Maybe the rename done between O and X were pure enough that "diff-tree -M O X" would have found it, but commits Y, Z, or B changed the contents of the renamed path too greatly for "diff-tree -M O B" to notice the rename anymore. "git merge B" when you are at A will not find such a rename, but if we allowed rename detection stepwise to look at "diff-tree -M" for (O,X), (X,Y), (Y,Z), (Z,B), (and the same for history between O and A), we might be able to find renames better [*1*]. Suppose instead of merging the whole history leading to B, you would want to apply the small fix made with Y on top of A: X---Y---Z---B X---Y---Z---B / ===> / ---O---o---o---A ---O---o---o---A---Y' You would "checkout A && cherry-pick Y" which amounts to three-way merge using X as "common", A as "mine" and Y as "his", which is the moral equivalent of: read-tree -m -u X A Y but with rename detection. merge_trees(A, Y, X) could traverse ancestry between X and A to find O and figure out where in A the paths that are affected in diff(X,Y) appear by making pairwise "diff-tree -M" to go from X back to O and then forward to A to find out that where the paths touched by Y were originally in O and where they are in A. But you are assuming that "common" and "ours" have any ancestry relationship. You generally cannot. The most extreme case is "am -3" where both "common" and "theirs" are pure trees without any ancestry relationship to anything else. They are built by looking at the "index" lines contained in the patch and contain only those paths that are affected by the patch, and the merge machinery merges that change into arbitrary "ours". You can cherry-pick a change C from history that does not have any common ancestor with your history leading to "ours" ("common"=C^ and "theirs"=C in this case), and the same applies to revert of C ("common"=C and "theirs"=C^). "rebase --onto A C^ C" works the same way. So this "extra ancestry information to help rename discovery" can at most be a hint. More importantly, all of the above does not have anything to do with the "recursiveness" of merge-recursive (which is the difference between merge_trees() and merge_recursive()) at all. So I am still correct to say that "cherry-pick", "revert", "checkout -m", "am -3" and "stash apply" should use the merge_trees() interface, not the recursive one. [Footnote] *1* But that would be quite more expensive than what we currently do, and it only deals with an uninteresting special case of wholesale file rename. I suspect if we ever do the stepwise thing, we would be better off doing not "diff-tree -M" but "blame -C" to really find where the lines affected between O and B ended up in A, and apply the change there. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html