On Wed, Jan 20, 2021 at 08:40:22AM -0500, Derrick Stolee wrote: > On 1/19/2021 6:24 PM, Taylor Blau wrote: > > for (m = r->objects->multi_pack_index; m; m = m->next) { > > - if (fill_midx_entry(r, oid, e, m)) > > + if (!(fill_midx_entry(r, oid, e, m))) > > nit: we don't need extra parens around fill_midx_entry(). Yep. I checked whether we should have written this as "if (fill_midx_entry(...) < 0)", but fill_midx_entry returns a positive number on error, so checking "!fill_midx_entry" is certainly what we should be doing. > > - if (!p->multi_pack_index && fill_pack_entry(oid, e, p)) { > > - list_move(&p->mru, &r->objects->packed_git_mru); > > - return 1; > > + if (p->multi_pack_index && !kept_only) { > > + /* > > + * If this pack is covered by the MIDX, we'd have found > > + * the object already in the loop above if it was here, > > + * so don't bother looking. > > + * > > + * The exception is if we are looking only at kept > > + * packs. An object can be present in two packs covered > > + * by the MIDX, one kept and one not-kept. And as the > > + * MIDX points to only one copy of each object, it might > > + * have returned only the non-kept version above. We > > + * have to check again to be thorough. > > + */ > > + continue; > > + } > > + if (!kept_only || > > + (((kept_only & ON_DISK_KEEP_PACKS) && p->pack_keep) || > > + ((kept_only & IN_CORE_KEEP_PACKS) && p->pack_keep_in_core))) { > > + if (fill_pack_entry(oid, e, p)) { > > + list_move(&p->mru, &r->objects->packed_git_mru); > > + return 1; > > + } > > Here is the meat of your patch. The comment helps a lot. > > This might have been easier if the MIDX had preferred kept packs > over non-kept packs (before sorting by modified time). Perhaps > the MIDX could get an extra field to say "I preferred kept packs" > which would let us trust the MIDX return here without the pack > loop. > > (Note: we can't just change the MIDX selection and then start > trusting all MIDXs to have the right tie-breakers because of > existing files in the wild.) Yeah, that is what makes it tricky. Changing the code isn't so hard: a new field that we check and do one of two things when we're breaking ties. But I think the cognitive load is high, and I'm not sure that the benefit (skipping another linear pass through non-MIDX'd packs when looking up an object in kept packs only _and_ that object is duplicated) is worth the extra hassle with the MIDX code. All of that said, I do think that it's worth revisiting this and giving it some more thought after multi-pack bitmaps to see whether we feel the same or not. Thanks, Taylor