On Thu, Jan 28, 2021 at 06:33:10PM -0800, Junio C Hamano wrote: > Taylor Blau <me@xxxxxxxxxxxx> writes: > > > Future callers will want a function to fill a 'struct pack_entry' for a > > given object id but _only_ from its position in any kept pack(s). They > > could accomplish this by calling 'find_pack_entry()' and checking > > whether the found pack is kept or not, but this is insufficient, since > > there may be duplicate objects (and the mru cache makes it unpredictable > > which variant we'll get). > > I wonder if we eventually need a callback interface to walk _all_ > pack entries for a given object, so that "I am only interested in > instances in kept packs" will be under total control of the callers. > As it stands, it is "just grab any one that is in a kept pack, any > one of them is fine", which is almost just of as narrow utility as > the original's "just grab the first one---any one of them is fine", > the latter of which is "insufficient" as the log message says. > > But this (in the context of the remainder of the series) might be > sufficient, at least for now. As you note, it's more about "can I find this object in any kept pack (of a certain kind)" versus, "show me this object in a pack" (and hope that if it appears in a kept pack, that that's the copy that is picked). > > Teach this new function to treat the two different kinds of kept packs > > (on disk ones with .keep files, as well as in-core ones which are set by > > manually poking the 'pack_keep_in_core' bit) separately. This will > > become important for callers that only want to respect a certain kind of > > kept pack. > > Or maybe not ;-) :-). The difference here is that we will only want to stop the traversal at packs which are considered to be stable from the perspective of a geometric repack. We mark those packs as "stable" by setting their in-core kept bit, but we don't write ".keep" files (which would make them on-disk kept). The latter is up to the user, not us. > If there are notable relationship between on-disk and in-core kept > packs (e.g. "the set of on-disk kept packs is a subset of in-core > kept packs", "usually on-disk kept packs get in-core kept bit upon > their packed_git instances are populated, but we can drop the bit at > runtime, so on-disk and in-core are pretty much independent and > there is no notable relationship"), it must be explained upfront to > help the reader form a sensible world view. Unfortunately, I don't think that there is a sensible world-view here to be formed. Honestly, the distinction between .keep packs and in-core kept packs is incredibly narrow, and I find our separate handling of them awkward and error-prone. But, it is sort of what you'd want here (i.e., a way to mark all objects in a pack as ignored without actually writing the physical file that says "ignore all objects in this pack"). Thanks, Taylor