On Sun, Feb 23, 2025 at 2:30 AM Devste Devste <devstemail@xxxxxxxxx> wrote:
>
> I have a merge commit that includes 2 modified (!) files:

What do you mean that it only includes 2 modified files? Modified
relative to what? Modified relative to the merge base of its parents?
Modified relative to its first parent? To its second parent? Modified
relative to an automatic merge?

Also, by "modified" here do you mean the change type is 'M' in
--name-status output, or could the change type also be 'A' (added) or
'D' (deleted) or something else?

> hello/foo/stubs/example.php
> hello/world.php
>
> I want to only get the changes introduced by the merge commit and
> exclude any changes in /foo/stubs/:
> git diff -l0 --name-status --find-renames "$sha"^'!' -- ':!*/foo/stubs/*'

It's not clear to me from your example what the output of, say,

    git diff --name-status --no-renames "$sha"^'!' | wc -l

would be, though I would find that very interesting. I'm also curious
what you'd get from each of

    git diff --diff-filter=D --name-status --no-renames "$sha"^'!' | wc -l
    git diff --diff-filter=A --name-status --no-renames "$sha"^'!' | wc -l
    git diff --diff-filter=M --name-status --no-renames "$sha"^'!' | wc -l

(and yes, I am very intentionally leaving off the ':!*/foo/stubs/*'
negative pathspec; I want the output without that.)

> Git takes more than 4 minutes to generate this diff, since
> hello/foo/stubs/example.php is a huge file.

How do you know that is the reason? Especially since...

> When using --no-renames (instead of --find-renames) it's much, much faster.

...this seems to contradict your statement that the reason for the slow
diff is that hello/foo/stubs/example.php is a huge file.

> And without the example.php file, the diff takes less than 1 second
> instead of 4+ minutes.

What do you mean by "without the example.php file"? Did you rewind
history, remove that file, and then redo the merge so that it is no
longer included? Or do you mean something else entirely? What exactly?

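If it helps, the four counts I asked for above (total, then D, A and M)
could be collected in one go with something like the sketch below. The
name count_changes is made up for illustration and is not a git command;
the function only wraps the same git diff invocations shown earlier, with
rename detection off and no pathspec exclusions.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): print how many paths a commit
# touches, in total and per change type, with rename detection disabled.
count_changes() {
    sha=$1

    # Total number of changed paths, deliberately without any pathspec.
    total=$(git diff --name-status --no-renames "$sha"^'!' | wc -l)
    # $((...)) strips the padding some wc implementations add.
    echo "total: $((total))"

    # One count per change type: deleted, added, modified.
    for filter in D A M; do
        n=$(git diff --diff-filter="$filter" --name-status --no-renames "$sha"^'!' | wc -l)
        echo "$filter: $((n))"
    done
}
```

Running count_changes "$sha" in the affected repository and pasting the
four lines would answer most of the questions above.
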
> Funnily enough, when I have a merge commit that contains only that 1
> excluded file, it's the same behavior.
>
> 1) if there's only a single file in a commit, why does --find-renames
> cause a slowdown? There's nothing that could have been renamed in that
> case (probably the same for --find-copies)

I'm not sure what this has to do with the above; you seem to have
switched tracks. Suppose you have a commit whose toplevel tree has
exactly 1 file, and you're diffing it against some other commit with N
files. If that other commit happens to have a file with the same name
as the 1 file, then --find-renames can't really cause a slowdown. It'd
only cause a slowdown when the N files in the other commit all have
different filenames than the 1 file in the commit you are diffing
against (but are of mostly similar filesize).

But I suspect you meant something other than what you said here. Could
you clarify the actual setup?

> 2) could rename detection be "delayed" to only run/check if there are
> actually additions/deletions (and possibly only check those)? If a
> commit only contains modifications (unlike in a really, really 0.0001%
> edge case) but no additions+deletions it's extremely unlikely that
> there's a rename, so detection could be skipped altogether?

Rename detection already does this; in fact, it does better. Not only
can you exit early when additions + deletions are empty, you can also
exit early when either of the two is empty. (In fact, there are some
other optimizations as well, such as exiting early if either additions
or deletions become empty after removing any paths involved in exact
rename detection, or removing any paths involved in basename-driven
rename matching.) If you want to see where this is handled, see the
"if (!num_destinations || !num_sources)" check in diffcore-rename.c.

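To illustrate the shape of that early-exit check from the outside, here
is a rough shell sketch. It is not how git implements it (the real check
is the C code mentioned above), and has_rename_candidates is a
hypothetical name. It also assumes plain rename detection, where deleted
paths are the only rename sources and added paths the only destinations;
copy detection and break detection change that picture.

```shell
#!/bin/sh
# Hypothetical predicate (not part of git): succeed only if the commit
# has at least one deleted path AND at least one added path, i.e. at
# least one potential rename source and one potential destination.
has_rename_candidates() {
    sha=$1

    # Deleted paths are the potential rename sources.
    sources=$(git diff --diff-filter=D --name-only --no-renames "$sha"^'!' | wc -l)
    # Added paths are the potential rename destinations.
    destinations=$(git diff --diff-filter=A --name-only --no-renames "$sha"^'!' | wc -l)

    # If either side is empty, there is nothing to pair up.
    [ "$((sources))" -gt 0 ] && [ "$((destinations))" -gt 0 ]
}
```

Since git already performs this check internally, wrapping git diff in
such a test would not be expected to speed anything up; it only shows
which inputs make rename detection a no-op.
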
Now, all that said, I suspect you're getting at something with the
negative pathspecs that is similar to the optimization idea I had for a
real --follow-renames, but before I jump into that, I'd need you to
clarify your setup a fair amount to make sure we're on the same page.