On Tue, Feb 16, 2021 at 04:42:38PM -0500, Jeff King wrote: > On Wed, Feb 03, 2021 at 10:58:50PM -0500, Taylor Blau wrote: > > > Future callers will want a function to fill a 'struct pack_entry' for a > > given object id but _only_ from its position in any kept pack(s). > > > > In particular, an new 'git repack' mode which ensures the resulting > > Nit (not worth re-rolling): s/an new/a new/ Oops. Good eyes. > > There is a gotcha when looking up objects that are duplicated in kept > > and non-kept packs, particularly when the MIDX stores the non-kept > > version and the caller asked for kept objects only. This could be > > resolved by teaching the MIDX to resolve duplicates by always favoring > > the kept pack (if one exists), but this breaks an assumption in existing > > MIDXs, and so it would require a format change. > > I don't think this would be possible without a major rethink of how > midxs work. The "keep" property of a pack is not set in stone when the > midx is created. You could add a ".keep" file to one of its packs later, > or even mark one as an in-core keep on the fly. But the duplicate > resolution happens at creation. > > So maybe your "breaks an assumption" is the notion that we do not store > duplicate information at all in the midx. If so, then I agree. :) But > I'd also call fixing that more than just a format change. That's part of it, indeed. The part that I was referring to is that existing MIDX readers expect duplicates to be resolved in a certain way (effectively in favor of the pack with the lowest mtime). So the easy part is indicating a format change which tells new readers how to expect ties to be broken. But (as you note) that's only part of the problem: even if we say "ties are resolved in favor of the lowest mtime pack, or a .keep one, if it exists", then which ones are kept and which aren't? Even *if* we wrote that down (which I'm not suggesting we do), kept-ness isn't an immutable property of the pack, and so I think relying on it is a tricky direction to take. > (None of which changes your point, which isn't that it isn't worth > pursuing that direction). Yeah; my hope in writing some of this down in the above paragraph is that it would make clear to future readers that such a MIDX change would resolve some complexity here, but the complexity it adds in the MIDX code isn't worth the tradeoff. > -Peff Thanks, Taylor