On Mon, Aug 22, 2022 at 01:03:11PM -0400, Derrick Stolee wrote: > On 8/19/2022 5:30 PM, Taylor Blau wrote: > > > Resolve this by adding all objects from the preferred pack separately > > when it appears in the existing MIDX (if one was present). This will > > duplicate objects from that pack that *did* appear in the MIDX, but this > > is fine, since get_sorted_entries() already handles duplicates. (A > > future optimization in this area could avoid adding copies of objects > > that we know already existing in the MIDX.) > > ... > > > This resolves the bug described in the first patch of this series. > > Thinking ahead to when this is a commit, perhaps this could instead > refer to the 'preferred pack change with existing MIDX bitmap' test > case? Good idea, thanks. > > @@ -610,10 +609,7 @@ static void midx_fanout_add_midx_fanout(struct midx_fanout *fanout, > > nth_midxed_pack_midx_entry(m, > > &fanout->entries[fanout->nr], > > cur_object); > > - if (nth_midxed_pack_int_id(m, cur_object) == preferred_pack) > > - fanout->entries[fanout->nr].preferred = 1; > > - else > > - fanout->entries[fanout->nr].preferred = 0; > > + fanout->entries[fanout->nr].preferred = 0; > > fanout->nr++; > > Here, we have lost the ability to set the 'preferred' bit from the > previous MIDX. Good. Yep, we don't want to propagate any of these bits forward when reusing an existing MIDX. Thinking on it more, I think this is the only legitimate use of MIDX reuse in the "I'm about to write bitmaps" context. I mentioned before the idea that we could use `--stdin-packs` now when writing a MIDX bitmap where before it wasn't implemented (likely due to problems caused by this bug). But the whole premise doesn't make a ton of sense: - Every pack that's in the include_packs list would need to be handled separately. - And every pack that *isn't* in that list would be skipped. Which means that it wouldn't help at all to reuse an existing MIDX. The reason that we'd need to handle all included packs separately is subtle and a little different from what's going on here, though. The problem there is that if you have two packs, say P1 and P2, and P1 is in the include list but P2 is not, then any objects duplicated between the two and selected from P2 will disappear when writing the new MIDX. Since the set of packs that are going into the new MIDX are precisely equal to the set of packs that we'd need to handle separately, it probably makes sense to continue to avoid using the existing MIDX when writing a bitmap with the `--stdin-packs` option. > > @@ -694,6 +689,11 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, > > preferred, cur_fanout); > > } > > > > + if (-1 < preferred_pack && preferred_pack < start_pack) > > + midx_fanout_add_pack_fanout(&fanout, info, > > + preferred_pack, 1, > > + cur_fanout); > > + > > And here, when there is a preferred pack _in the previous MIDX_, > we add its objects a second time, but now with the preferred bit > on. If the preferred pack is _not_ in the previous MIDX, then the > 'preferred_pack < start_pack' condition will fail and the bits > would have been set within the for loop. Exactly! > > @@ -346,7 +346,7 @@ test_expect_success 'preferred pack change with existing MIDX bitmap' ' > > test_path_is_file $midx && > > test_path_is_file $midx-$(midx_checksum $objdir).bitmap && > > > > - test_must_fail git clone --no-local . clone2 > > + git clone --no-local . clone2 > > I mentioned in patch 1 that this test could use some comments about > what is unexpected and what _is_ expected. I think this comment needs > an update in this patch: > > # Generate a new MIDX which changes the preferred pack to a pack > # contained in the existing MIDX, such that not all objects from > # p2 that appear in the MIDX had their copy selected from p2. Good eyes, thanks for spotting. I updated the comment below, too (which doesn't exist in this version of the patch, but you suggested adding to the first patch in this series). Thanks, Taylor