On Wed, Jul 28, 2021 at 01:46:21PM -0400, Jeff King wrote: > On Tue, Jul 27, 2021 at 04:05:39PM -0400, Taylor Blau wrote: > > > > I actually think having write_midx_internal() open up a new midx is > > > reasonable-ish. It's just that: > > > > > > - it's weird when it stuffs duplicate packs into the > > > r->objects->packed_git list. But AFAICT that's not actually hurting > > > anything? > > > > It is hurting us when we try to write a MIDX bitmap, because we try to > > see if one already exists. And to do that, we call prepare_bitmap_git(), > > which tries to call open_pack_bitmap_1 on *each* pack in the packed_git > > list. Critically, prepare_bitmap_git() errors out if it is called with a > > bitmap_git that has a non-NULL `->pack` pointer. > > It doesn't error out. It does produce a warning(), though, if it ignores > a bitmap (and that warning is doubly confusing because it is ignoring > bitmap X because it has already loaded and will use that exact same X!). > > This causes t7700.13 to fail because it is being picky about stderr > being empty. Right, sorry for suggesting that the error was more severe than it actually is. > So the overall behavior is correct, but I agree it's sufficiently ugly > that we should make sure it doesn't happen. 100% agreed. I think the most unfortunate thing is the state of r->objects->packed_git, since it's utterly bizarre to have the same pack opened twice and have both of those copies in the list. That is definitely worth preventing. > Side note: IMHO the "check all packs to see if there are any other > bitmaps to warn about" behavior is kind of pointless, and we should > consider just returning as soon as we have one. This is already > somewhat the case after your midx-bitmap patches, as we will not even > bother to look for a pack bitmap after finding a midx bitmap. That is > a good thing, because it means you can keep pack bitmaps around for > flexibility. But let's leave any changes to the pack-only behavior out > of this series for simplicity. I agree. I'd be in favor of something like diff --git a/pack-bitmap.c b/pack-bitmap.c index f599646e19..5450ffb04c 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -378,13 +378,6 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git return -1; } - if (bitmap_git->pack || bitmap_git->midx) { - /* ignore extra bitmap file; we can only handle one */ - warning("ignoring extra bitmap file: %s", packfile->pack_name); - close(fd); - return -1; - } - bitmap_git->pack = packfile; bitmap_git->map_size = xsize_t(st.st_size); bitmap_git->map = xmmap(NULL, bitmap_git->map_size, PROT_READ, MAP_PRIVATE, fd, 0); @@ -465,16 +458,15 @@ static int open_pack_bitmap(struct repository *r, struct bitmap_index *bitmap_git) { struct packed_git *p; - int ret = -1; assert(!bitmap_git->map); for (p = get_all_packs(r); p; p = p->next) { - if (open_pack_bitmap_1(bitmap_git, p) == 0) - ret = 0; + if (!open_pack_bitmap_1(bitmap_git, p)) + return 0; } - return ret; + return -1; } static int open_midx_bitmap(struct repository *r, ...but agree that we should wait until after the dust has settled on this already-complex series. > > - We should always be operating on the repository's > > r->objects->multi_pack_index, or any other MIDX that can be reached > > via walking the `->next` pointers. If we do that consistently, then > > we'll only have at most one instance of a MIDX struct corresponding > > to each MIDX file on disk. > > Certainly that makes sense to me in terms of the Windows "must close the > current midx before writing" behavior. We have to realize that we're > operating in the current repo. > > But we do allow an "--object-dir" option to "multi-pack-index write", > and I don't see any other code explicitly requiring that it be part of > the current repository. What I'm wondering is whether this would be > breaking: > > cd $REPO/.. > git multi-pack-index --object-dir $REPO/.git/objects write > > or: > > cd /some/other/repo > git multi-pack-index --object-dir $REPO/.git/objects write > > The latter does seem to work, but the former segfaults (usually -- if > there's already a midx it is OK). The former should work, but doesn't, because (as you pointed out to me in our regular weekly discussion off-list) that the "multi-pack-index" entry in git.c's commands array has the RUN_SETUP_GENTLY option, and probably should have RUN_SETUP so that we complain with die() instead of BUG. And the latter will continue to work, but only if in your scenario that $REPO is an alternate of /some/other/repo. I wrote a little bit more in [1] about this behavior, but the upshot is that we used to technically support passing *any* directory to `--object-dir`, including directories that didn't belong to an alternated repository. And that will cease to work after the patch that [1] is in response to is applied. But for the reasons that I explain there, I think that is a sufficient outcome, because the behavior is kind of bizarre to begin with. [1]: https://lore.kernel.org/git/YQMB32fvSiH9julg@nand.local/ Thanks, Taylor