On Fri, Feb 28, 2025 at 06:26:45PM -0500, Taylor Blau wrote: > On Fri, Feb 28, 2025 at 11:01:04AM +0100, Patrick Steinhardt wrote: > > On Tue, Nov 19, 2024 at 05:07:19PM -0500, Taylor Blau wrote: > > > diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt > > > index cc063b30bea..a063262c360 100644 > > > --- a/Documentation/technical/multi-pack-index.txt > > > +++ b/Documentation/technical/multi-pack-index.txt > > > @@ -164,6 +164,70 @@ objects_nr($H2) + objects_nr($H1) + i > > > (in the C implementation, this is often computed as `i + > > > m->num_objects_in_base`). > > > > > > +=== Pseudo-pack order for incremental MIDXs > > > + > > > +The original implementation of multi-pack reachability bitmaps defined > > > +the pseudo-pack order in linkgit:gitformat-pack[5] (see the section > > > +titled "multi-pack-index reverse indexes") roughly as follows: > > > + > > > +____ > > > +In short, a MIDX's pseudo-pack is the de-duplicated concatenation of > > > +objects in packs stored by the MIDX, laid out in pack order, and the > > > +packs arranged in MIDX order (with the preferred pack coming first). > > > +____ > > > + > > > +In the incremental MIDX design, we extend this definition to include > > > +objects from multiple layers of the MIDX chain. The pseudo-pack order > > > +for incremental MIDXs is determined by concatenating the pseudo-pack > > > +ordering for each layer of the MIDX chain in order. Formally two objects > > > +`o1` and `o2` are compared as follows: > > > + > > > +1. If `o1` appears in an earlier layer of the MIDX chain than `o2`, then > > > + `o1` is considered less than `o2`. > > > > Just as a refresher for myself: what is the consequence of an object > > `o1` sorting earlier than `o2`? In the case where those refer to > > different objects it is only used to establish the pseudo-pack order so > > that we know how to interpret the bitmaps. But in the case where those > > two objects refer to the same underlying object, e.g. because the object > > is contained in two packs, it also impacts which of both objects would > > be preferred e.g. during a clone, right? > > Great question -- the pseudo-pack order here is how we translate the set > of objects in a MIDX into their corresponding bit positions in the > bitmap. > > So if "o1" sorts ahead of "o2", that means that "o1" will appear in an > earlier bit position than "o2". But note that we're talking about > objects in a MIDX chain here, comprised of objects from each MIDX'd layer of > that chain. So by that point the duplicates have already been filtered > out, since: > > - The MIDX only stores one copy of an object in any given MIDX, and > > - The incremental MIDX design avoids putting objects from earlier > layers in later ones. > > I tried to get at this a few lines up with "[...] a MIDX's pseudo-pack > is the de-duplicated concatenation of [...]" to make clear that o1 != o2 > here. But let me know if you think I should clarify or emphasize that > point further. Okay, the deduplication bit was a bit subtle, so I missed that part. And once one has learned about it my question makes less sense, as I was expecting that an object may appear in the same MIDX chain multiple times. Patrick