Re: [PATCH v4 01/13] Documentation: describe incremental MIDX bitmaps

On Fri, Mar 14, 2025 at 1:18 PM Taylor Blau <me@xxxxxxxxxxxx> wrote:
>
> Prepare to implement support for reachability bitmaps for the new
> incremental multi-pack index (MIDX) feature over the following commits.
>
> This commit begins by first describing the relevant format and usage
> details for incremental MIDX bitmaps.
>
> Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>
> ---
>  Documentation/technical/multi-pack-index.adoc | 71 +++++++++++++++++++
>  1 file changed, 71 insertions(+)
>
> diff --git a/Documentation/technical/multi-pack-index.adoc b/Documentation/technical/multi-pack-index.adoc
> index cc063b30be..ab98ecfeb9 100644
> --- a/Documentation/technical/multi-pack-index.adoc
> +++ b/Documentation/technical/multi-pack-index.adoc
> @@ -164,6 +164,77 @@ objects_nr($H2) + objects_nr($H1) + i
>  (in the C implementation, this is often computed as `i +
>  m->num_objects_in_base`).
>
> +=== Pseudo-pack order for incremental MIDXs
> +
> +The original implementation of multi-pack reachability bitmaps defined
> +the pseudo-pack order in linkgit:gitformat-pack[5] (see the section
> +titled "multi-pack-index reverse indexes") roughly as follows:
> +
> +____
> +In short, a MIDX's pseudo-pack is the de-duplicated concatenation of
> +objects in packs stored by the MIDX, laid out in pack order, and the
> +packs arranged in MIDX order (with the preferred pack coming first).
> +____
> +
> +In the incremental MIDX design, we extend this definition to include
> +objects from multiple layers of the MIDX chain. The pseudo-pack order
> +for incremental MIDXs is determined by concatenating the pseudo-pack
> +ordering for each layer of the MIDX chain in order. Formally two objects
> +`o1` and `o2` are compared as follows:
> +
> +1. If `o1` appears in an earlier layer of the MIDX chain than `o2`, then
> +  `o1` is considered less than `o2`.

For a sort order, "less than" alone doesn't say whether we are sorting
smallest to greatest or greatest to smallest.  Maybe "less than (so
its order is earlier than) `o2`"?

> +
> +2. Otherwise, if `o1` and `o2` appear in the same MIDX layer, and that
> +   MIDX layer has no base, then if one of `pack(o1)` and `pack(o2)` is
> +   preferred and the other is not, then the preferred one sorts first. If
> +   there is a base layer (i.e. the MIDX layer is not the first layer in
> +   the chain), then if `pack(o1)` appears earlier in that MIDX layer's
> +   pack order, than `o1` is less than `o2`. Likewise if `pack(o2)`

s/than/then/

> +   appears earlier, than the opposite is true.

s/than/then/

> +
> +3. Otherwise, `o1` and `o2` appear in the same pack, and thus in the
> +   same MIDX layer. Sort `o1` and `o2` by their offset within their
> +   containing packfile.
> +
> +Note that the preferred pack is a property of the MIDX chain, not the
> +individual layers themselves. Fundamentally we could introduce a
> +per-layer preferred pack, but this is less relevant now that we can
> +perform multi-pack reuse across the set of packs in a MIDX.
> +
> +=== Reachability bitmaps and incremental MIDXs
> +
> +Each layer of an incremental MIDX chain may have its objects (and the
> +objects from any previous layer in the same MIDX chain) represented in
> +its own `*.bitmap` file.
> +
> +The structure of a `*.bitmap` file belonging to an incremental MIDX
> +chain is identical to that of a non-incremental MIDX bitmap, or a
> +classic single-pack bitmap. Since objects are added to the end of the
> +incremental MIDX's pseudo-pack order (see: above), it is possible to

drop the colon?

> +extend a bitmap when appending to the end of a MIDX chain.
> +
> +(Note: it is possible likewise to compress a contiguous sequence of MIDX
> +incremental layers, and their `*.bitmap`(s) into a single layer and
> +`*.bitmap`, but this is not yet implemented.)

"`*.bitmap`(s)" feels slightly awkward and only saves 2 characters.
Maybe just "`*.bitmap` files"?

> +
> +The object positions used are global within the pseudo-pack order, so
> +subsequent layers will have, for example, `m->num_objects_in_base`
> +number of `0` bits in each of their four type bitmaps. This follows from
> +the fact that we only write type bitmap entries for objects present in
> +the layer immediately corresponding to the bitmap).
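
A small sketch of how I understand the global positions, in case it
helps other readers. The `layer` struct is a hypothetical stand-in whose
field names only mirror the ones used in the text, and
`global_bit_pos()` is an invented helper, not a function from the
series:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one MIDX layer; not the real struct. */
struct layer {
	uint32_t num_objects;         /* objects in this layer alone */
	uint32_t num_objects_in_base; /* objects in all earlier layers */
};

/*
 * Map the i-th object of layer `m` (in that layer's own pseudo-pack
 * order) to its bit position in the chain-wide pseudo-pack order.
 */
uint32_t global_bit_pos(const struct layer *m, uint32_t i)
{
	assert(i < m->num_objects);
	return i + m->num_objects_in_base;
}
```

Under this reading, the lowest bit that a layer's type bitmaps can set
is `m->num_objects_in_base`, so every bit below it is `0` in that
layer's four type bitmaps.
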
> +
> +Note also that only the bitmap pertaining to the most recent layer in an
> +incremental MIDX chain is used to store reachability information about
> +the interesting and uninteresting objects in a reachability query.
> +Earlier bitmap layers are only used to look up commit and pseudo-merge
> +bitmaps from that layer, as well as the type-level bitmaps for objects
> +in that layer.
> +
> +To simplify the implementation, type-level bitmaps are iterated
> +simultaneously, and their results are OR'd together to avoid recursively
> +calling internal bitmap functions.
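
For what it's worth, this is roughly how I picture the OR step. It is
only a sketch: it uses plain uncompressed `uint64_t` words where the
real bitmaps are EWAH-compressed, and `or_type_bitmaps()` with its
parameters is a made-up name, not something from the series:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Combine the bitmap for one object type across all layers.
 * layers[i] points at layer i's bitmap for that type; every bitmap
 * spans the whole pseudo-pack order and is `nwords` words long.
 */
void or_type_bitmaps(uint64_t *out, const uint64_t *const *layers,
		     size_t nr_layers, size_t nwords)
{
	size_t i, w;

	assert(out != NULL);
	for (w = 0; w < nwords; w++) {
		out[w] = 0;
		/* Walk all layers in lockstep instead of recursing. */
		for (i = 0; i < nr_layers; i++)
			out[w] |= layers[i][w];
	}
}
```

Since each layer only sets bits for its own objects, the per-layer
bitmaps never overlap and the OR is effectively a concatenation.
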
> +
>  Future Work
>  -----------

Should the patch also remove the first item from Future Work, since
this series is implementing it?


> --
> 2.49.0.13.gd0d564685b




