On Fri, Jun 07, 2024 at 07:43:56PM -0700, Kyle Lippincott wrote: > I believe what's happening is that pack-bitmap.c:2091 grows the packs > list and sets up some of the fields, but doesn't set pack_int_id. We > then use it at pack-bitmap.c:1888. > > I investigated, but couldn't prove to myself what value should be > placed there while growing it, or if it's incorrect to read from it in > this case (so we shouldn't be in pack-bitmap.c:1888 with this pack). Hmm, I'm not sure. In reuse_partial_packfile_from_bitmap(), the code path that creates the struct only kicks in when the "multi_pack_reuse" flag isn't set. Which generally would correspond to whether we have a midx. And then the code in try_partial_reuse() that uses the struct similarly checks bitmap_is_midx() before looking at the pack_int_id field. But that changed in 795006fff4 (pack-bitmap: gracefully handle missing BTMP chunks, 2024-04-15), where we also disable multi_pack_reuse if we have a midx but it has no BTMP chunk. So we end up in the non-multi code path to create the struct, but then try_partial_reuse() still realizes we have a midx and uses that code path. I guess this gets into the "we have a midx, but are only doing reuse out of a single pack" case. Which I think is supported, but I'm not familiar enough with the code to know where the assumption is going wrong. Anyway... > Reproducing is potentially non-trivial. This may work: > > make -j CFLAGS="-fsanitize=memory -fsanitize-memory-track-origins > -fno-omit-frame-pointer -g -O2" CC=clang && \ > make -C t t5326-multi-pack-bitmaps.sh GIT_TEST_OPTS="--verbose --debug" Yeah, I forgot what a pain MSan is: - I needed to use a recent version of clang (the default for Debian unstable is clang-16, which is missing intercepts for newer versions of glibc strtoimax(), and complains that it does not set the outgoing "end" pointer). Using clang-19 fixed that. - Since I didn't recompile all of the dependent libraries with MSan support, I hit the usual zlib problems discussed long ago in: https://lore.kernel.org/git/20171004101932.pai6wzcv2eohsicr@xxxxxxxxxxxxxxxxxxxxx/ and had to resurrect the patch there. - While bisecting (which is how I found 795006fff4), some versions complain that attr.c does: buf = read_blob_data_from_index(istate, path, &size); stack = read_attr_from_buf(buf, size, path, flags); If reading failed, then "buf" is NULL and size is uninitialized. Inside read_attr_from_buf() we bail immediately when we see the NULL buf, but MSan complains that we passed the uninitialized "size" at all. Newer versions (starting with c793f9cb08) move the NULL check into the caller. An easier (but slower) reproduction is to just use valgrind. After doing a normal build, running: ./t5326-multi-pack-bitmaps.sh --run=1-24 --valgrind-only=24 -i is enough to see the problem. -Peff