Re: [PATCH] diffcore-rename: fix BUG when break detection and --follow used together

Elijah Newren <newren@xxxxxxxxx> · Fri, 14 Mar 2025 15:28:09 -0700

On Fri, Mar 14, 2025 at 10:24 AM Jeff King <peff@xxxxxxxx> wrote:
>
> On Sat, Mar 08, 2025 at 01:00:15AM +0000, Elijah Newren via GitGitGadget wrote:
>
> > It turns out that making a testcase to trigger this is a bit challenging
> > too.  I added a simple testcase which tickles the necessary area, but
> > running it normally actually passes for me.  However, running it under
> > valgrind shows that it is depending upon uninitialized memory.  I
> > suspect that to get a reliable reproduction case, I might need to have
> > several more paths involved, but that might make the testcase more
> > difficult to understand.  So, I instead just embedded a warning within
> > the testname that the test triggered uninitialized memory use.

Maybe I should have been a little more clear about "a bit
challenging".  I spent hours on it.  And I suspected that the only way
to trigger the reporter's particular manifestation of the issue was to
use an invalid command.  I didn't want to spend any more hours on it,
but...

> I think it's OK for a test case to require extra memory checks to fail;
> after all, these kinds of bugs are usually non-deterministic without
> those checks anyway.
>
> I did verify that it reproduces for me with "--valgrind". I was
> surprised (and a little disappointed) that it doesn't seem to trigger
> with ASan/UBSan. We do run those routinely in CI, but I doubt that
> --valgrind gets used regularly for the whole test suite by anyone these
> days, just because it's so much slower.
>
> I'm puzzled, though, why the test case at the beginning of this
> thread[1] yields the BUG() so readily, but your test case doesn't.
>
> So maybe this is the best we can do, but it feels like we should be able
> to at least trigger the existing BUG() reliably. I couldn't seem to
> figure it out, though. :(

So, after _another_ 7 hours or so on it today...  The BUG() the
reporter triggered only happens when there is no uninitialized memory
use, and is only triggerable when you use invalid flags.  For the
reporter, they passed a directory name for their pathspec along with
--follow, despite the fact that --follow only works when given a
single pathspec that names an individual file.  You can also trigger
their particular manifestation of the issue here when using a glob
pathspec together with --follow.  In either event, the --follow
becomes useless: when the follow logic checks whether filenames are
equal to the given pathspec to see if it might be a relevant rename,
no filename is exactly equal to the pathspec, so it never finds any
relevant files to follow or to include in the rename detection.  The
upshot is the command basically behaves the same as if you hadn't
given --follow, other than the fact that the presence of --follow
makes diffcore_rename throw away rename_dst pairs in a way that
happens to trigger this particular BUG().

The testcase I found is:
    seq 1 127 >numbers &&
    git add numbers &&
    git commit -m "numbers" &&

    printf "%s\n" A B C D E F G H I J K L M N O Q R S T U V W X Y Z >pool &&
    seq 1 10 >numbers &&
    git add pool numbers &&
    git commit -m "pool" &&

    git log -1 -B --raw --follow -- "p*"

The BUG will still be triggered if
  * you change the content of pool (you can even make it as similar as
you want to numbers)
  * you change the content of numbers in the second commit while
keeping it sufficiently different from the first commit
The BUG will _not_ be triggered if:
  * you change the log's pathspec to match numbers
  * you change the log's pathspec to not match pool
  * you change the log's pathspec to be exactly "pool"
  * you remove any of the three files (two versions of number and one
of pool) from the testcase
  * you reduce 127 (even if only to 126)
  * you make numbers from the second commit too similar to numbers
from the first commit

In all of these cases (and this is also true for the reporter's
original case), in locate_rename_dst(), idx will be computed as 0, but
rename_dst is NULL, so &rename_dst[idx] is NULL as well.

However, I think the fact that rename_dst == NULL implies
&rename_dst[0] == NULL should raise alarm bells about the risks of
using memory improperly, even if it doesn't directly use uninitialized
memory in this case.  Which brings us to the bigger more encompassing
issue, which is what I reported in the commit message:

> > In short, when these two rare options are used together, fix the
> > accidental find of the wrong dst entry (which would often be
> > uninitialized memory just past the end of the array), by adding a little
> > more care around the recorded indices for break_idx.

It's just the special case when rename_dst is empty and, in fact,
NULL, that you trigger the BUG() call.

> Your description of the problem and the solution both seemed sensible to
> me (though I'm not all that familiar with the ins and outs of the rename
> code these days).

Thanks.

>
> -Peff
>
> [1] The simplest I came up with is:
>
>       git clone --bare https://github.com/intel/linux-sgx.git tmp.git
>       cd tmp.git
>       git --no-pager log -B --follow 63d0e65cfa49bb46a8dbe8745bb15aaf226faa97 -- external/ippcp_internal/inc

Yeah, that was the same I was using last week.

>     Curiously, that pathspec is actually a directory, but it only has a
>     single file in it. Feeding the actual file in it _doesn't_ trigger
>     the bug:
>
>       git --no-pager log -B --follow 63d0e65cfa49bb46a8dbe8745bb15aaf226faa97 -- external/ippcp_internal/inc/ippcp20u3.patch

Yeah, I was doing the same thing last week when fixing this bug.
Sorry, I guess I should have mentioned this to avoid you attempting
the same.

>     That just shows the single commit (even though there is a single
>     file before and after that commit, it is not similar enough to find
>     a rename).
>
>     The file that hits the break detection is unrelated. It looks like
>     it's psw/ae/data/prebuilt/le_prod_css.bin.
>
>     I tried variants with break files, subdirs, etc, but I couldn't seem
>     to make anything work.

Maybe what I wrote above helps.  Is this enough information to satisfy
your curiosity?

I suspect adding this second test to the commit makes sense.  Which
parts of my explanation in this email would you like to see added to
the commit message as well, or is it fine as-is?