On Thu, May 23, 2024 at 12:38:03PM -0400, Taylor Blau wrote: > When passing a preferred pack to the MIDX write machinery, we ensure > that the given preferred pack is non-empty since 5d3cd09a808 (midx: > reject empty `--preferred-pack`'s, 2021-08-31). > > However packs are only loaded (via `write_midx_internal()`, though a > subsequent patch will refactor this code out to its own function) when > the `MIDX_WRITE_REV_INDEX` flag is set. > > So if a caller runs: > > $ git multi-pack-index write --preferred-pack=... > > with both (a) an existing MIDX, and (b) specifies a pack from that MIDX > as the preferred one, without passing `--bitmap`, then the check added > in 5d3cd09a808 will result in a segfault. The check you're talking about is the following one, right? if (ctx.preferred_pack_idx > -1) { struct packed_git *preferred = ctx.info[ctx.preferred_pack_idx].p; if (!preferred->num_objects) { error(_("cannot select preferred pack %s with no objects"), preferred->pack_name); result = 1; goto cleanup; } } And the segfault is because the index wasn't populated, and thus `ctx.info[ctx.preferred_pack_idx].p == NULL`? > Note that packs loaded from disk which don't appear in an existing MIDX > do not trigger this issue, as those packs are loaded unconditionally. We > conditionally load packs from a MIDX since we tolerate MIDXs whose > packs do not resolve (i.e., via the MIDX write after removing > unreferenced packs via 'git multi-pack-index expire'). > > In practice, this isn't possible to trigger when running `git > multi-pack-index write` from via `git repack`, as the latter always s/from via/via/ > passes `--stdin-packs`, which prevents us from loading an existing MIDX, > as it forces all packs to be read from disk. > > But a future commit in this series will change that behavior to > unconditionally load an existing MIDX, even with `--stdin-packs`, making > this behavior trigger-able from 'repack' much more easily. > > Prevent this from being an issue by removing the segfault altogether by Removing segfaults is always good :) > calling `prepare_midx_pack()` on packs loaded from an existing MIDX when > either the `MIDX_WRITE_REV_INDEX` flag is set *or* we specified a > `--preferred-pack`. > > Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx> > --- > midx-write.c | 8 +++++++- > t/t5319-multi-pack-index.sh | 23 +++++++++++++++++++++++ > 2 files changed, 30 insertions(+), 1 deletion(-) > > diff --git a/midx-write.c b/midx-write.c > index 9d096d5a28..03e95ae821 100644 > --- a/midx-write.c > +++ b/midx-write.c > @@ -930,11 +930,17 @@ static int write_midx_internal(const char *object_dir, > for (i = 0; i < ctx.m->num_packs; i++) { > ALLOC_GROW(ctx.info, ctx.nr + 1, ctx.alloc); > > - if (flags & MIDX_WRITE_REV_INDEX) { > + if (flags & MIDX_WRITE_REV_INDEX || > + preferred_pack_name) { > /* > * If generating a reverse index, need to have > * packed_git's loaded to compare their > * mtimes and object count. > + * > + * If a preferred pack is specified, > + * need to have packed_git's loaded to > + * ensure the chosen preferred pack has > + * a non-zero object count. > */ > if (prepare_midx_pack(the_repository, ctx.m, i)) { > error(_("could not load pack")); We now end up loading all packs, but in practice it should be sufficient to only load the preferred pack, right? Is there a particular reason why we now load all packs? Patrick
Attachment:
signature.asc
Description: PGP signature