On Mon, Mar 01, 2021 at 08:17:53PM -0800, Jonathan Tan wrote: > I was initially confused that "preferred" was set twice, but this makes > sense - the first one is when an existing midx is reused, and the second > one is for objects in packs that the midx (if it exists) does not cover. Yep. Those two paths permeate a lot of the MIDX writer code, since it wants to reuse work from an existing MIDX if it can find one. > > @@ -828,7 +869,19 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * > > if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) > > goto cleanup; > > > > - ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr); > > + if (preferred_pack_name) { > > + for (i = 0; i < ctx.nr; i++) { > > + if (!cmp_idx_or_pack_name(preferred_pack_name, > > + ctx.info[i].pack_name)) { > > + ctx.preferred_pack_idx = i; > > + break; > > + } > > + } > > + } else > > + ctx.preferred_pack_idx = -1; > > Looks safer to put "ctx.preferred_pack_idx = -1" before the "if", just > in case the given pack name does not exist? Agreed. > > @@ -889,6 +942,31 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * > > pack_name_concat_len += strlen(ctx.info[i].pack_name) + 1; > > } > > > > + /* > > + * Recompute the preferred_pack_idx (if applicable) according to the > > + * permuted pack order. > > + */ > > + ctx.preferred_pack_idx = -1; > > + if (preferred_pack_name) { > > + ctx.preferred_pack_idx = lookup_idx_or_pack_name(ctx.info, > > + ctx.nr, > > + preferred_pack_name); > > + if (ctx.preferred_pack_idx < 0) > > + warning(_("unknown preferred pack: '%s'"), > > + preferred_pack_name); > > + else { > > + uint32_t orig = ctx.info[ctx.preferred_pack_idx].orig_pack_int_id; > > + uint32_t perm = ctx.pack_perm[orig]; > > + > > + if (perm == PACK_EXPIRED) { > > + warning(_("preferred pack '%s' is expired"), > > + preferred_pack_name); > > + ctx.preferred_pack_idx = -1; > > + } else > > + ctx.preferred_pack_idx = perm; > > + } > > + } > > I couldn't figure out why the preferred pack index needs to be > recalculated here, since the pack entries would have already been > sorted. Also, the tests still pass when I comment this part out. A > comment describing what's going on would be helpful. Funny you mention that; I was wondering the same thing myself the other day when reading these patches again before deploying them to a couple of testing repositories at GitHub. It is totally unnecessary: since we have already marked objects from the preferred pack in get_sorted_entries(), the rest of the code doesn't care if the preferred pack was permuted or not. But we *do* care if the pack which was preferred expired. The 'git repack --geometric --write-midx' caller (which will appear in a later series) should never do that, so emitting a warning() is worthwhile. I think ultimately you want something like this squashed in: --- >8 --- diff --git a/midx.c b/midx.c index d2c56c4bc6..46f55ff6cf 100644 --- a/midx.c +++ b/midx.c @@ -582,7 +582,7 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, struct pack_info *info, uint32_t nr_packs, uint32_t *nr_objects, - uint32_t preferred_pack) + int preferred_pack) { uint32_t cur_fanout, cur_pack, cur_object; uint32_t alloc_fanout, alloc_objects, total_objects = 0; @@ -869,6 +869,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) goto cleanup; + ctx.preferred_pack_idx = -1; if (preferred_pack_name) { for (i = 0; i < ctx.nr; i++) { if (!cmp_idx_or_pack_name(preferred_pack_name, @@ -877,8 +878,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * break; } } - } else - ctx.preferred_pack_idx = -1; + } ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr, ctx.preferred_pack_idx); @@ -942,28 +942,21 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * pack_name_concat_len += strlen(ctx.info[i].pack_name) + 1; } - /* - * Recompute the preferred_pack_idx (if applicable) according to the - * permuted pack order. - */ - ctx.preferred_pack_idx = -1; + /* Check that the preferred pack wasn't expired (if given). */ if (preferred_pack_name) { - ctx.preferred_pack_idx = lookup_idx_or_pack_name(ctx.info, - ctx.nr, - preferred_pack_name); - if (ctx.preferred_pack_idx < 0) + int preferred_idx = lookup_idx_or_pack_name(ctx.info, + ctx.nr, + preferred_pack_name); + if (preferred_idx < 0) warning(_("unknown preferred pack: '%s'"), preferred_pack_name); else { - uint32_t orig = ctx.info[ctx.preferred_pack_idx].orig_pack_int_id; + uint32_t orig = ctx.info[preferred_idx].orig_pack_int_id; uint32_t perm = ctx.pack_perm[orig]; - if (perm == PACK_EXPIRED) { + if (perm == PACK_EXPIRED) warning(_("preferred pack '%s' is expired"), preferred_pack_name); - ctx.preferred_pack_idx = -1; - } else - ctx.preferred_pack_idx = perm; } }