On Fri, Jul 23, 2021 at 04:29:37AM -0400, Jeff King wrote: > On Wed, Jul 21, 2021 at 03:22:34PM -0400, Taylor Blau wrote: > > > > > This avoids a problem that would arise in subsequent patches due to the > > > > combination of 'git repack' reopening the object store in-process and > > > > the multi-pack index code not checking whether a pack already exists in > > > > the object store when calling add_pack_to_midx(). > > > > > > > > This would ultimately lead to a cycle being created along the > > > > 'packed_git' struct's '->next' pointer. That is obviously bad, but it > > > > has hard-to-debug downstream effects like saying a bitmap can't be > > > > loaded for a pack because one already exists (for the same pack). > > > > > > I'm not sure I completely understand the bug that this causes. > > > > Off-hand, I can't quite remember either. But it is important; I do have > > a distinct memory of dropping this patch and then watching a 'git repack > > --write-midx' (that option will be introduced in a later series) fail > > horribly. > > > > If I remember correctly, the bug has to do with loading a MIDX twice in > > the same process. When we call add_packed_git() from within > > prepare_midx_pack(), we load the pack without caring whether or not it's > > already loaded. So loading a MIDX twice in the same process will fail. > > > > So really I think that this is papering over that bug: we're just > > removing one of the times that we happened to load a MIDX from during > > the writing phase. > > Hmm, after staring at this for a bit, I've unconfused and re-confused > myself several times. > > Here are some interesting bits: > > - calling load_multi_pack_index() directly creates a new midx object. > None of its m->packs[] array will be filled in. Nor is it reachable > as r->objects->multi_pack_index. > > - in using that midx, we end up calling prepare_midx_pack() for > various packs, which creates a new packed_git struct and adds it to > r->objects->packed_git (via install_packed_git()). > > So that's a bit weird already, because we have packed_git structs in > r->objects that came from a midx that isn't r->objects->multi_pack_index. > And then if we later call prepare_multi_pack_index(), for example as > part of a pack reprepare, then we'd end up with duplicates. Ah, this jogged my memory: this is a relic from when we generated MIDX bitmaps in-process with the rest of the `repack` code. And when we did that, we did have to call `reprepare_packed_git()` after writing the new packs but before moving them into place. So that's where the `reprepare_packed_git()` came from, but we don't have any of that code anymore, since we now generate MIDX bitmaps by invoking: git multi-pack-index write --bitmap --stdin-packs --refs-snapshot as a sub-process of `git repack`; no need for any reprepare which is what was triggering this bug. To be sure, I reverted this patch out of GitHub's fork, and reran the tests both in normal mode (just `make test`) and then once more with the `GIT_TEST_MULTI_PACK_INDEX{,_WRITE_BITMAP}` environment variables set. Unsurprisingly, it passed both times. I'm happy to keep digging further, but I think that I'm 99% satisfied here. Digging further involves resurrecting a much older version of this series (and others adjacent to it), and there are probably other bugs lurking that would be annoying to tease out. In any case, let's drop this patch from the series. It's disappointing that we can't run: git -c core.multiPackIndex= multi-pack-index write anymore, but I guess that's no worse than the state we were in before this patch, so I'm content to let it live on. Thanks, Taylor