Re: [PATCH 01/10] packfile: introduce 'find_kept_pack_entry()'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 28, 2021 at 06:33:10PM -0800, Junio C Hamano wrote:
> Taylor Blau <me@xxxxxxxxxxxx> writes:
>
> > Future callers will want a function to fill a 'struct pack_entry' for a
> > given object id but _only_ from its position in any kept pack(s). They
> > could accomplish this by calling 'find_pack_entry()' and checking
> > whether the found pack is kept or not, but this is insufficient, since
> > there may be duplicate objects (and the mru cache makes it unpredictable
> > which variant we'll get).
>
> I wonder if we eventually need a callback interface to walk _all_
> pack entries for a given object, so that "I am only interested in
> instances in kept packs" will be under total control of the callers.
> As it stands, it is "just grab any one that is in a kept pack, any
> one of them is fine", which is almost just of as narrow utility as
> the original's "just grab the first one---any one of them is fine",
> the latter of which is "insufficient" as the log message says.
>
> But this (in the context of the remainder of the series) might be
> sufficient, at least for now.

As you note, it's more about "can I find this object in any kept pack
(of a certain kind)" versus, "show me this object in a pack" (and hope
that if it appears in a kept pack, that that's the copy that is picked).

> > Teach this new function to treat the two different kinds of kept packs
> > (on disk ones with .keep files, as well as in-core ones which are set by
> > manually poking the 'pack_keep_in_core' bit) separately. This will
> > become important for callers that only want to respect a certain kind of
> > kept pack.
>
> Or maybe not ;-)

:-). The difference here is that we will only want to stop the traversal
at packs which are considered to be stable from the perspective of a
geometric repack.

We mark those packs as "stable" by setting their in-core kept bit, but
we don't write ".keep" files (which would make them on-disk kept). The
latter is up to the user, not us.

> If there are notable relationship between on-disk and in-core kept
> packs (e.g. "the set of on-disk kept packs is a subset of in-core
> kept packs", "usually on-disk kept packs get in-core kept bit upon
> their packed_git instances are populated, but we can drop the bit at
> runtime, so on-disk and in-core are pretty much independent and
> there is no notable relationship"), it must be explained upfront to
> help the reader form a sensible world view.

Unfortunately, I don't think that there is a sensible world-view here
to be formed. Honestly, the distinction between .keep packs and in-core
kept packs is incredibly narrow, and I find our separate handling of
them awkward and error-prone.

But, it is sort of what you'd want here (i.e., a way to mark all objects
in a pack as ignored without actually writing the physical file that
says "ignore all objects in this pack").

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux