Re: MSan failures in pack-bitmap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jun 07, 2024 at 07:43:56PM -0700, Kyle Lippincott wrote:

> I believe what's happening is that pack-bitmap.c:2091 grows the packs
> list and sets up some of the fields, but doesn't set pack_int_id. We
> then use it at pack-bitmap.c:1888.
> 
> I investigated, but couldn't prove to myself what value should be
> placed there while growing it, or if it's incorrect to read from it in
> this case (so we shouldn't be in pack-bitmap.c:1888 with this pack).

Hmm, I'm not sure.

In reuse_partial_packfile_from_bitmap(), the code path that creates the
struct only kicks in when the "multi_pack_reuse" flag isn't set. Which
generally would correspond to whether we have a midx. And then the code
in try_partial_reuse() that uses the struct similarly checks
bitmap_is_midx() before looking at the pack_int_id field.

But that changed in 795006fff4 (pack-bitmap: gracefully handle missing
BTMP chunks, 2024-04-15), where we also disable multi_pack_reuse if we
have a midx but it has no BTMP chunk. So we end up in the non-multi code
path to create the struct, but then try_partial_reuse() still realizes
we have a midx and uses that code path.

I guess this gets into the "we have a midx, but are only doing reuse out
of a single pack" case. Which I think is supported, but I'm not familiar
enough with the code to know where the assumption is going wrong.

Anyway...

> Reproducing is potentially non-trivial. This may work:
> 
> make -j CFLAGS="-fsanitize=memory -fsanitize-memory-track-origins
> -fno-omit-frame-pointer -g -O2" CC=clang && \
> make -C t t5326-multi-pack-bitmaps.sh GIT_TEST_OPTS="--verbose --debug"

Yeah, I forgot what a pain MSan is:

  - I needed to use a recent version of clang (the default for Debian
    unstable is clang-16, which is missing intercepts for newer versions
    of glibc strtoimax(), and complains that it does not set the
    outgoing "end" pointer). Using clang-19 fixed that.

  - Since I didn't recompile all of the dependent libraries with MSan
    support, I hit the usual zlib problems discussed long ago in:

     https://lore.kernel.org/git/20171004101932.pai6wzcv2eohsicr@xxxxxxxxxxxxxxxxxxxxx/

    and had to resurrect the patch there.

  - While bisecting (which is how I found 795006fff4), some versions
    complain that attr.c does:

      buf = read_blob_data_from_index(istate, path, &size);
      stack = read_attr_from_buf(buf, size, path, flags);

    If reading failed, then "buf" is NULL and size is uninitialized.
    Inside read_attr_from_buf() we bail immediately when we see the NULL
    buf, but MSan complains that we passed the uninitialized "size" at
    all. Newer versions (starting with c793f9cb08) move the NULL check
    into the caller.

An easier (but slower) reproduction is to just use valgrind. After doing
a normal build, running:

  ./t5326-multi-pack-bitmaps.sh --run=1-24 --valgrind-only=24 -i

is enough to see the problem.

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux