Re: [PATCH v2 1/6] Documentation/technical: describe bitmap lookup table extension

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 27, 2022 at 10:18:51AM -0400, Derrick Stolee wrote:
> On 6/26/2022 9:10 AM, Abhradeep Chakraborty via GitGitGadget wrote:
> > +			triplets. The format and meaning of the table is described
> > +			below.
> > ++
> > +NOTE: This xor_offset is different from the bitmap's xor_offset.
> > +Bitmap's xor_offset is relative i.e. it tells how many bitmaps we have
> > +to go back from the current bitmap. Lookup table's xor_offset tells the
> > +position of the triplet in the list whose bitmap the current commit's
> > +bitmap have to xor with.
>
> I found this difficult to parse. Here is an attempt at a rewording. Please
> let me know if I misunderstood something when reading your version:
>
>   NOTE: The xor_offset stored in the BITMAP_OPT_LOOKUP_TABLE is different
>   from the xor_offset used in the bitmap data table. The xor_offset in this
>   table indicates the row number within this table of the commit whose
>   bitmap is used for the XOR computation with the current commit's stored
>   bitmap to create the proper logical reachability bitmap.
>
> This does make me think that "xor_offset" should really be "xor_row" or
> something like that.

To be fair, I found Stolee's version equally difficult to parse. I
wonder if something like the following would be clearer:

    NOTE: Unlike the xor_offset used to compress an individual bitmap,
    this value stores an *absolute* index into the lookup table, not a
    location relative to the current entry.

> > +For a `.bitmap` containing `nr_entries` reachability bitmaps, the table
> > +contains a list of `nr_entries` <commit pos, offset, xor offset> triplets.
> > +The content of i'th triplet is -
> > +
> > +	* {empty}
> > +	commit pos (4 byte integer, network byte order): ::
> > +	It stores the object position of the commit (in the midx or pack index)
> > +	to which the i'th bitmap in the bitmap entries belongs.
>
> Ok, we are saving some space here, but relying on looking into the pack-index
> or multi-pack-index to get the actual commit OID.
>
> Since this is sorted by the order that stores the bitmaps, binary search will
> no longer work on this list (unless we enforce that on the rest of the bitmap
> file). I am going to expect that you parse this table into a hashmap in order
> to allow fast commit lookups. I'll keep an eye out for that implementation.

The main purpose of this series is to avoid having to construct such a
table ahead of time. This is more or less akin to what the existing
implementation already does in load_bitmap_entries_v1(), though that
function has to read (but not decompress!) all bitmaps.

But I disagree that this isn't binary searchable. The object positions
are in MIDX or pack .idx order, so they are sorted lexicographically.
The comparator implementation could either take as its key an object_id,
and then convert each of the "commit pos" fields themselves to
object_ids and call oidcmp().

Or we could go the other way (as it looks like Abhradeep did in a later
patch) and convert the key's object_id into the index or MIDX-relative
position, and search for that.

> > +	* {empty}
> > +	offset (8 byte integer, network byte order): ::
> > +	The offset from which that commit's bitmap can be read.
> > +
> > +	* {empty}
> > +	xor offset (4 byte integer, network byte order): ::
> > +	It holds the position of the triplet with whose bitmap the
> > +	current bitmap need to xor. If the current triplet's bitmap
> > +	do not have any xor bitmap, it defaults to 0xffffffff.
>
> This last sentence seems backward. Perhaps:
>
>   If the value is 0xffffffff, then the current bitmap has no xor bitmap.

Perhaps even more concisely:

    The position of a triplet whose bitmap is used to compress this one,
    or 0xffffffff if no such bitmap exists.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux