Re: [PATCH 1/8] midx-write.c: tolerate `--preferred-pack` without bitmaps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 29, 2024 at 09:47:52AM +0200, Patrick Steinhardt wrote:
> On Thu, May 23, 2024 at 12:38:03PM -0400, Taylor Blau wrote:
> > When passing a preferred pack to the MIDX write machinery, we ensure
> > that the given preferred pack is non-empty since 5d3cd09a808 (midx:
> > reject empty `--preferred-pack`'s, 2021-08-31).
> >
> > However packs are only loaded (via `write_midx_internal()`, though a
> > subsequent patch will refactor this code out to its own function) when
> > the `MIDX_WRITE_REV_INDEX` flag is set.
> >
> > So if a caller runs:
> >
> >     $ git multi-pack-index write --preferred-pack=...
> >
> > with both (a) an existing MIDX, and (b) specifies a pack from that MIDX
> > as the preferred one, without passing `--bitmap`, then the check added
> > in 5d3cd09a808 will result in a segfault.
>
> The check you're talking about is the following one, right?
>
>     if (ctx.preferred_pack_idx > -1) {
>             struct packed_git *preferred = ctx.info[ctx.preferred_pack_idx].p;
>             if (!preferred->num_objects) {
>                     error(_("cannot select preferred pack %s with no objects"),
>                           preferred->pack_name);
>                     result = 1;
>                     goto cleanup;
>             }
>     }
>
> And the segfault is because the index wasn't populated, and thus
> `ctx.info[ctx.preferred_pack_idx].p == NULL`?

Exactly.

> > Note that packs loaded from disk which don't appear in an existing MIDX
> > do not trigger this issue, as those packs are loaded unconditionally. We
> > conditionally load packs from a MIDX since we tolerate MIDXs whose
> > packs do not resolve (i.e., via the MIDX write after removing
> > unreferenced packs via 'git multi-pack-index expire').
> >
> > In practice, this isn't possible to trigger when running `git
> > multi-pack-index write` from via `git repack`, as the latter always
>
> s/from via/via/

Ah, oops, thanks.

> > passes `--stdin-packs`, which prevents us from loading an existing MIDX,
> > as it forces all packs to be read from disk.
> >
> > But a future commit in this series will change that behavior to
> > unconditionally load an existing MIDX, even with `--stdin-packs`, making
> > this behavior trigger-able from 'repack' much more easily.
> >
> > Prevent this from being an issue by removing the segfault altogether by
>
> Removing segfaults is always good :)
>
> > calling `prepare_midx_pack()` on packs loaded from an existing MIDX when
> > either the `MIDX_WRITE_REV_INDEX` flag is set *or* we specified a
> > `--preferred-pack`.
> >
> > Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>
> > ---
> >  midx-write.c                |  8 +++++++-
> >  t/t5319-multi-pack-index.sh | 23 +++++++++++++++++++++++
> >  2 files changed, 30 insertions(+), 1 deletion(-)
> >
> > diff --git a/midx-write.c b/midx-write.c
> > index 9d096d5a28..03e95ae821 100644
> > --- a/midx-write.c
> > +++ b/midx-write.c
> > @@ -930,11 +930,17 @@ static int write_midx_internal(const char *object_dir,
> >  		for (i = 0; i < ctx.m->num_packs; i++) {
> >  			ALLOC_GROW(ctx.info, ctx.nr + 1, ctx.alloc);
> >
> > -			if (flags & MIDX_WRITE_REV_INDEX) {
> > +			if (flags & MIDX_WRITE_REV_INDEX ||
> > +			    preferred_pack_name) {
> >  				/*
> >  				 * If generating a reverse index, need to have
> >  				 * packed_git's loaded to compare their
> >  				 * mtimes and object count.
> > +				 *
> > +				 * If a preferred pack is specified,
> > +				 * need to have packed_git's loaded to
> > +				 * ensure the chosen preferred pack has
> > +				 * a non-zero object count.
> >  				 */
> >  				if (prepare_midx_pack(the_repository, ctx.m, i)) {
> >  					error(_("could not load pack"));
>
> We now end up loading all packs, but in practice it should be sufficient
> to only load the preferred pack, right? Is there a particular reason why
> we now load all packs?

I had originally considered that, but discarded the idea because it
seemed like it might be fragile to depend on those two points in the
code having the exact same notion of what the preferred pack is.

Since it's cheap enough to load all of these packs in unconditionally
anyways, that's the route that I ended up taking here.

Thanks,
Taylor




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux