Junio C Hamano <gitster@xxxxxxxxx> writes: > Thomas Rast <trast@xxxxxxxxxxxxxxx> writes: > >> This would be the first backwards coupling between the revision-walk and >> the diff generation parts, at least that I know of. > > I am not convinced if you need to have any unusual back-coupling to > begin with, by the way. > > If you say "git log -p [--options] -- pathspec", the revision > machinery does filter commits that do not touch any paths that patch > pathspec with the TREESAME logic, but that does not necessarily mean > you will see _all_ the commits that are not TREESAME. Let's clarify this part a bit. Imagine you have three commits on a single strand of pearls. A---B---C Between A and B, you only corrected indentation errors in file F as a clean-up step, and then between B and C, you did a real change. "git diff A B -- F" gives changes, "git diff -w A B -- F" does not. Both "git diff B C -- F" and "git diff -w B C -- F" give you some output. Now, "git log --no-root -p -w C -- F" will give you C but not B. Let's see how that happens. The revision machinery looks at C and finds its parent B. It runs object level tree comparison and finds that their trees are different at path F. It makes a mental note that it may need to show the log message of C, and asks the diff machinery to run diff-tree between B and C. The diff machinery finds that it needs to show something even in the presense of -w option by actual comparison, and just before showing the very first line of patch output, it shows the log message of C (due to the earlier "mental note"). Then the revision machinery looks at B. It does the same between B and A, but this time around, the diff machinery finds that, even though A and B were _not_ TREESAME at the revision traversal level, there is nothing to be shown after filtering with the -w option. Hence no patch is shown and log message for B is not shown, either. The interesting part of the above is that we did not bother using the -w comparison to affect TREESAME check at the revision traversal level. I consider your "-L bottom,top" essentially the same "two blobs may be different at the object name level, but after filtering the diff, there may be no output" filter just like the "-w" option is. Instead of filtering hunks that have changes only in whitespaces, it filters hunks that do not overlap the given line range. So if you replace "-w" in the above example with "-L bottom,top", exactly the same thing should happen. The current code structure does not consider the content level filtering done by the "-w" when computing TREESAME. This DOES affect how the merges are simplified. If you and I start from a commit X with full of indentation mistakes, you fix the earlier half of these mistakes and make commit T while I fix the later half of these mistakes and make commit J, and we merge our results into merge M: J---M / / X---T none of "diff -w J M", "diff -w T M", "diff -w X J", "diff -w X T" will output anything. "git diff -p -w M" will however traverse both branches, even though in this example nothing will be shown in the output, because no two trees in these four commits are TREESAME. In the longer term, this _may_ be something we want to fix, but teaching the revision machinery about only "-L bottom,top" is not the way to do it. Instead, we should devise a new mechanism to call into the diff_patch machinery from the place where the revision machinery computes TREESAME, and let it ask the filtered content level differences if any of these options (these flags flip the DIFF_FROM_CONTENTS diff option) are in use. As a part of the implementation of that new mechanism, we may want to cache the patch such an early comparison produces, and reuse it when we actually produce output, as an optimization. But I think that is not limited to the "-L bottom,top" option and an orthogonal issue, as it will work equally well for existing "-w" filter (there may be others). The above is where my recommendations are coming from. The first step for "-L bottom,top" should be to hook it as a new kind of DIFF_FROM_CONTENTS option that compares two non-identical blobs but possibly yield empty result into the diff machinery at the same level as "-w". Once that is solidly done, we can think about updating the merge simplification logic to take DIFF_FROM_CONTENTS changes into account when it computes TREESAME. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html