Re: [PATCH 6/6] midx.c: include preferred pack correctly with existing MIDX

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 22, 2022 at 01:03:11PM -0400, Derrick Stolee wrote:
> On 8/19/2022 5:30 PM, Taylor Blau wrote:
>
> > Resolve this by adding all objects from the preferred pack separately
> > when it appears in the existing MIDX (if one was present). This will
> > duplicate objects from that pack that *did* appear in the MIDX, but this
> > is fine, since get_sorted_entries() already handles duplicates. (A
> > future optimization in this area could avoid adding copies of objects
> > that we know already existing in the MIDX.)
>
> ...
>
> > This resolves the bug described in the first patch of this series.
>
> Thinking ahead to when this is a commit, perhaps this could instead
> refer to the 'preferred pack change with existing MIDX bitmap' test
> case?

Good idea, thanks.

> > @@ -610,10 +609,7 @@ static void midx_fanout_add_midx_fanout(struct midx_fanout *fanout,
> >  		nth_midxed_pack_midx_entry(m,
> >  					   &fanout->entries[fanout->nr],
> >  					   cur_object);
> > -		if (nth_midxed_pack_int_id(m, cur_object) == preferred_pack)
> > -			fanout->entries[fanout->nr].preferred = 1;
> > -		else
> > -			fanout->entries[fanout->nr].preferred = 0;
> > +		fanout->entries[fanout->nr].preferred = 0;
> >  		fanout->nr++;
>
> Here, we have lost the ability to set the 'preferred' bit from the
> previous MIDX. Good.

Yep, we don't want to propagate any of these bits forward when reusing
an existing MIDX. Thinking on it more, I think this is the only
legitimate use of MIDX reuse in the "I'm about to write bitmaps"
context.

I mentioned before the idea that we could use `--stdin-packs` now when
writing a MIDX bitmap where before it wasn't implemented (likely due to
problems caused by this bug). But the whole premise doesn't make a ton
of sense:

  - Every pack that's in the include_packs list would need to be handled
    separately.

  - And every pack that *isn't* in that list would be skipped.

Which means that it wouldn't help at all to reuse an existing MIDX. The
reason that we'd need to handle all included packs separately is subtle
and a little different from what's going on here, though. The problem
there is that if you have two packs, say P1 and P2, and P1 is in the
include list but P2 is not, then any objects duplicated between the two
and selected from P2 will disappear when writing the new MIDX.

Since the set of packs that are going into the new MIDX are precisely
equal to the set of packs that we'd need to handle separately, it
probably makes sense to continue to avoid using the existing MIDX when
writing a bitmap with the `--stdin-packs` option.

> > @@ -694,6 +689,11 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
> >  						    preferred, cur_fanout);
> >  		}
> >
> > +		if (-1 < preferred_pack && preferred_pack < start_pack)
> > +			midx_fanout_add_pack_fanout(&fanout, info,
> > +						    preferred_pack, 1,
> > +						    cur_fanout);
> > +
>
> And here, when there is a preferred pack _in the previous MIDX_,
> we add its objects a second time, but now with the preferred bit
> on. If the preferred pack is _not_ in the previous MIDX, then the
> 'preferred_pack < start_pack' condition will fail and the bits
> would have been set within the for loop.

Exactly!

> > @@ -346,7 +346,7 @@ test_expect_success 'preferred pack change with existing MIDX bitmap' '
> >  		test_path_is_file $midx &&
> >  		test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
> >
> > -		test_must_fail git clone --no-local . clone2
> > +		git clone --no-local . clone2
>
> I mentioned in patch 1 that this test could use some comments about
> what is unexpected and what _is_ expected. I think this comment needs
> an update in this patch:
>
> 	# Generate a new MIDX which changes the preferred pack to a pack
> 	# contained in the existing MIDX, such that not all objects from
> 	# p2 that appear in the MIDX had their copy selected from p2.

Good eyes, thanks for spotting. I updated the comment below, too (which
doesn't exist in this version of the patch, but you suggested adding to
the first patch in this series).

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux