Re: [PATCH] diffcore-rename: fix BUG when break detection and --follow used together

Jeff King <peff@xxxxxxxx> · Mon, 17 Mar 2025 13:51:35 -0400

On Fri, Mar 14, 2025 at 03:28:09PM -0700, Elijah Newren wrote:

> > So maybe this is the best we can do, but it feels like we should be able
> > to at least trigger the existing BUG() reliably. I couldn't seem to
> > figure it out, though. :(
> 
> So, after _another_ 7 hours or so on it today...  The BUG() the
> reporter triggered only happens when there is no uninitialized memory
> use, and is only triggerable when you use invalid flags.  For the
> reporter, they passed a directory name for their pathspec along with
> --follow, despite the fact that --follow only works when given a
> single pathspec that names an individual file.  You can also trigger
> their particular manifestation of the issue here when using a glob
> pathspec together with --follow.  In either event, the --follow
> becomes useless: when the follow logic checks whether filenames are
> equal to the given pathspec to see if it might be a relevant rename,
> no filename is exactly equal to the pathspec, so it never finds any
> relevant files to follow or to include in the rename detection.  The
> upshot is the command basically behaves the same as if you hadn't
> given --follow, other than the fact that the presence of --follow
> makes diffcore_rename throw away rename_dst pairs in a way that
> happens to trigger this particular BUG().

OK, that all makes sense (and what I was trying to reduce the case, too,
but somehow it eluded me).

Sorry, I didn't mean for you to spend all day on it. I was just
wondering if there was low-hanging fruit we could convince to trigger
with ASan (which would much better protect against regression). It does
look like we at least got some benefit, though. ;)

> The testcase I found is:
>     seq 1 127 >numbers &&
>     git add numbers &&
>     git commit -m "numbers" &&
> 
>     printf "%s\n" A B C D E F G H I J K L M N O Q R S T U V W X Y Z >pool &&
>     seq 1 10 >numbers &&
>     git add pool numbers &&
>     git commit -m "pool" &&
> 
>     git log -1 -B --raw --follow -- "p*"

Yeah, that's similar to what I was trying, but I didn't try the globbing
pathspec. I stuck everything in a subdirectory and use subdir/. But I
think I got caught up on...

> The BUG will _not_ be triggered if:
>   * you change the log's pathspec to match numbers

I put both files into the subdir/, so they both matched.

> In all of these cases (and this is also true for the reporter's
> original case), in locate_rename_dst(), idx will be computed as 0, but
> rename_dst is NULL, so &rename_dst[idx] is NULL as well.

Makes sense.

> However, I think the fact that rename_dst == NULL implies
> &rename_dst[0] == NULL should raise alarm bells about the risks of
> using memory improperly, even if it doesn't directly use uninitialized
> memory in this case.  Which brings us to the bigger more encompassing
> issue, which is what I reported in the commit message:
> 
> > > In short, when these two rare options are used together, fix the
> > > accidental find of the wrong dst entry (which would often be
> > > uninitialized memory just past the end of the array), by adding a little
> > > more care around the recorded indices for break_idx.
> 
> It's just the special case when rename_dst is empty and, in fact,
> NULL, that you trigger the BUG() call.

Oh, absolutely. The uninitialized memory is the bigger problem, and the
fact that it BUG()-ed at all was mostly lucky.

> Maybe what I wrote above helps.  Is this enough information to satisfy
> your curiosity?

Definitely.

> I suspect adding this second test to the commit makes sense.  Which
> parts of my explanation in this email would you like to see added to
> the commit message as well, or is it fine as-is?

I read over your v2, and I think the explanation there is good. In the
long run we might eventually disallow --follow on the glob like this
(since as you note, it's invalid and doesn't actually work). In which
case I guess we'd end up deleting that test case. But in the meantime, I
think there's some benefit to having it.

-Peff