Re: [PATCH 6/9] Documentation/technical: describe multi-pack reverse indexes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 10, 2021 at 09:48:20PM -0500, Derrick Stolee wrote:
> nit: use multi-pack-index or MIDX, not lower-case 'midx'.

Thanks.

> > +Crucially, the objects' positions within this pseudo-pack are the same
> > +as their bit positions in a multi-pack reachability bitmap.
> > +
> > +As a motivating example, consider the multi-pack reachability bitmap
> > +(which does not yet exist, but is what we are building towards here). We
> > +need each bit to correspond to an object covered by the midx, and we
> > +need to be able to convert bit positions back to index positions (from
> > +which we can get the oid, etc).
>
> These paragraphs are awkward. Instead of operating in the hypothetical
> world of reachability bitmaps, focus on the fact that bitmaps need
> a bidirectional mapping between "bit position" and an object ID.

Hmm. I could buy that these paragraphs are awkward, but I'm not sure
that what you proposed makes it less so.

I may be a bad person to judge what you wrote, since I am familiar with
the details of what it's describing. But my thoughts on that second and
third paragraph are basically:

  - define the valid orderings we might consider objects in a MIDX by,
    indicating which of those orderings we're going to use for
    multi-pack bitmaps

  - motivate the need for a mapping between lexicographic order and
    pseudo-pack order

> Here is an attempt to reword some of the context you are using here.
> Feel free to take as much or as little as you want.
>
>   The multi-pack-index stores the object IDs in lexicographical order
>   (lex-order) to allow binary search. To allow compressible reachability
>   bitmaps to pair with a multi-pack-index, a different ordering is
>   required. When paired with a single packfile, the order used is the
>   object order within the packfile (called the pack-order). Construct
>   a "pseudo-pack" by concatenating all tracked packfiles in the
>   multi-pack-index. We now need a mapping between the lex-order and the
>   pseudo-pack-order.

I struggled with what you wrote because I couldn't seem to neatly
place/replace that paragraph in with the existing text without referring
to yet-undefined concepts.

Maybe the confusion lies in the fact that we stray too far from the
point in the second and third paragraphs. What if we reordered the
second, third, and fourth paragraph like this:

		Instead of mapping between offset, pack-, and index position, this
		reverse index maps between an object's position within the MIDX, and
		that object's position within a pseudo-pack that the MIDX describes.

		To clarify these three orderings, consider a multi-pack reachability
		bitmap (which does not yet exist, but is what we are building towards
		here). Each bit needs to correspond to an object in the MIDX, and so we
		need an efficient mapping from bit position to MIDX position.

		One solution is to let bits occupy the same position in the oid-sorted
		index stored by the MIDX. But because oids are effectively random, there
		resulting reachability bitmaps would have no locality, and thus compress
		poorly. (This is the reason that single-pack bitmaps use the pack
		ordering, and not the .idx ordering, for the same purpose.)

		So we'd like to define an ordering for the whole MIDX based around
		pack ordering, which has far better locality (and thus compresses more
		efficiently). We can think of a pseudo-pack created by the concatenation
		of all of the packs in the MIDX. E.g., if we had a MIDX with three packs
		(a, b, c), with 10, 15, and 20 objects respectively, we can imagine an
		ordering of the objects like:

> [snip]
>
> The rest of these details make sense and sufficiently motivate the
> ordering, once the concept is clear.
>
> Thanks,
> -Stolee

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux