On Wed, Jul 21, 2021 at 01:53:40PM -0400, Taylor Blau wrote:

> > > +	The ordering between packs is done lexicographically by the pack name,
> > > +	with the exception of the preferred pack, which sorts ahead of all other
> > > +	packs.
> >
> > Hmm, I'm not sure if this "lexicographically" part is true. Really we're
> > building on the midx .rev format here. And that says "defined by the
> > MIDX's pack list" (though I can't offhand remember if that is
> > lexicographic, or if it is in the reverse-mtime order).
> >
> > At any rate, should we just be referencing the rev documentation?
>
> The packs are listed in lex order in the MIDX, but that is so we can
> binary search that list to determine whether a pack is included in the
> MIDX or not.
>
> I had to check, but we do use the lex order to resolve duplicate
> objects, too. See (at the tip of this branch):
>
>     QSORT(ctx.info, ctx.nr, pack_info_compare);
>
> from within midx.c:write_midx_internal(). Here, ctx.info contains the
> list of packs, and pack_info_compare is a thin wrapper around
> strcmp()-ing the pack_name values of two packed_git structures.

Ah, OK, thanks for checking.

> Arguably, you'd get better EWAH compression of the bits between packs
> if we sorted packs in reverse order according to their mtime. But I
> suspect that it doesn't matter much in practice, since the number of
> objects vastly outpaces the number of packs (but I haven't measured to
> be certain, so take that with a grain of salt).

Agreed, especially when the intended use is with geometric repacking to
keep reasonable-sized packs. Either way, I think heuristics to optimize
the pack ordering can easily come on top later. Let's keep this series
focused on the fundamentals of having midx bitmaps at all.

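To make the lex-order point above concrete, here is a minimal standalone
sketch of sorting a pack list with a strcmp()-based comparator and then
binary searching it for membership. It is an illustration only and not
the midx.c code: struct pack_entry, pack_entry_compare(), sort_packs()
and pack_is_listed() are hypothetical names, and plain qsort()/bsearch()
stand in for git's own helpers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the pack_name field of a packed_git. */
struct pack_entry {
	const char *pack_name;
};

/* Compare two entries by strcmp() on their pack names. */
static int pack_entry_compare(const void *a, const void *b)
{
	const struct pack_entry *pa = a;
	const struct pack_entry *pb = b;
	return strcmp(pa->pack_name, pb->pack_name);
}

/* Put the list into lexicographic order by pack name. */
static void sort_packs(struct pack_entry *packs, size_t nr)
{
	qsort(packs, nr, sizeof(*packs), pack_entry_compare);
}

/*
 * Binary search a list already sorted by sort_packs().
 * Returns 1 if a pack with this name is listed, 0 otherwise.
 */
static int pack_is_listed(const struct pack_entry *packs, size_t nr,
			  const char *name)
{
	struct pack_entry key = { name };
	return bsearch(&key, packs, nr, sizeof(*packs),
		       pack_entry_compare) != NULL;
}
```

The same comparator is used for both the sort and the lookup, which is
what makes the binary search valid. The preferred-pack exception is
deliberately left out of this sketch.
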
> In any case, I think that you're right that adding too much detail hurts
> us here, so we should really be mentioning the MIDX's .rev-file
> documentation (unfortunately, we can't linkgit it, so mentioning it by
> name will have to suffice). I plan to reroll with something like this on
> top:
>
> diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
> index 25221c7ec8..04b3ec2178 100644
> --- a/Documentation/technical/bitmap-format.txt
> +++ b/Documentation/technical/bitmap-format.txt
> @@ -26,9 +26,8 @@ An object is uniquely described by its bit position within a bitmap:
>
>  		o1 <= o2 <==> pack(o1) <= pack(o2) /\ offset(o1) <= offset(o2)
>
> -	The ordering between packs is done lexicographically by the pack name,
> -	with the exception of the preferred pack, which sorts ahead of all other
> -	packs.
> +	The ordering between packs is done according to the MIDX's .rev file.
> +	Notably, the preferred pack sorts ahead of all other packs.
>
>  The on-disk representation (described below) of a bitmap is the same regardless
>  of whether or not that bitmap belongs to a packfile or a MIDX. The only

Thanks, that looks much better.

We can't linkgit, but we only build HTML for these. So just a link to
pack-format.html would work, as they'd generally be found side-by-side
in the filesystem. But since this doesn't even really render as
asciidoc, I'm not sure I care either way.

(Obviously we could also mention pack-format.txt by name, but it's
probably already obvious-ish to a human that this is where you'd find
information on the pack .rev format).

-Peff