Re: [PATCH v3 01/13] Documentation: describe incremental MIDX bitmaps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 28, 2025 at 06:26:45PM -0500, Taylor Blau wrote:
> On Fri, Feb 28, 2025 at 11:01:04AM +0100, Patrick Steinhardt wrote:
> > On Tue, Nov 19, 2024 at 05:07:19PM -0500, Taylor Blau wrote:
> > > diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt
> > > index cc063b30bea..a063262c360 100644
> > > --- a/Documentation/technical/multi-pack-index.txt
> > > +++ b/Documentation/technical/multi-pack-index.txt
> > > @@ -164,6 +164,70 @@ objects_nr($H2) + objects_nr($H1) + i
> > >  (in the C implementation, this is often computed as `i +
> > >  m->num_objects_in_base`).
> > >
> > > +=== Pseudo-pack order for incremental MIDXs
> > > +
> > > +The original implementation of multi-pack reachability bitmaps defined
> > > +the pseudo-pack order in linkgit:gitformat-pack[5] (see the section
> > > +titled "multi-pack-index reverse indexes") roughly as follows:
> > > +
> > > +____
> > > +In short, a MIDX's pseudo-pack is the de-duplicated concatenation of
> > > +objects in packs stored by the MIDX, laid out in pack order, and the
> > > +packs arranged in MIDX order (with the preferred pack coming first).
> > > +____
> > > +
> > > +In the incremental MIDX design, we extend this definition to include
> > > +objects from multiple layers of the MIDX chain. The pseudo-pack order
> > > +for incremental MIDXs is determined by concatenating the pseudo-pack
> > > +ordering for each layer of the MIDX chain in order. Formally two objects
> > > +`o1` and `o2` are compared as follows:
> > > +
> > > +1. If `o1` appears in an earlier layer of the MIDX chain than `o2`, then
> > > +  `o1` is considered less than `o2`.
> >
> > Just as a refresher for myself: what is the consequence of an object
> > `o1` sorting earlier than `o2`? In the case where those refer to
> > different objects it is only used to establish the pseudo-pack order so
> > that we know how to interpret the bitmaps. But in the case where those
> > two objects refer to the same underlying object, e.g. because the object
> > is contained in two packs, it also impacts which of both objects would
> > be preferred e.g. during a clone, right?
> 
> Great question -- the pseudo-pack order here is how we translate the set
> of objects in a MIDX into their corresponding bit positions in the
> bitmap.
> 
> So if "o1" sorts ahead of "o2", that means that "o1" will appear in an
> earlier bit position than "o2". But note that we're talking about
> objects in a MIDX chain here, comprised of objects from each MIDX'd layer of
> that chain. So by that point the duplicates have already been filtered
> out, since:
> 
>   - The MIDX only stores one copy of an object in any given MIDX, and
> 
>   - The incremental MIDX design avoids putting objects from earlier
>     layers in later ones.
> 
> I tried to get at this a few lines up with "[...] a MIDX's pseudo-pack
> is the de-duplicated concatenation of [...]" to make clear that o1 != o2
> here. But let me know if you think I should clarify or emphasize that
> point further.

Okay, the deduplication bit was a bit subtle, so I missed that part. And
once one has learned about it my question makes less sense, as I was
expecting that an object may appear in the same MIDX chain multiple
times.

Patrick




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux