Re: [PATCH v2 2/6] pack-bitmap-write.c: write lookup table extension

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 27, 2022 at 10:35:25AM -0400, Derrick Stolee wrote:
> On 6/26/2022 9:10 AM, Abhradeep Chakraborty via GitGitGadget wrote:
>
> > +	uint32_t a = positions[*(uint32_t *)_va];
> > +	uint32_t b = positions[*(uint32_t *)_vb];
> > +
> > +	if (a > b)
> > +		result = 1;
> > +	else if (a < b)
> > +		result = -1;
> > +	else
> > +		result = 0;
> > +
> > +	return result;
> > +}
>
> Ok, here you are sorting by commit OID (indirectly by the order in the
> [multi-]pack-index). I suppose that I misunderstood in the previous
> patch, so that could use some more specific language, maybe.

Yeah, I agree that some more specific language could be used, with the
main idea being there that we make it clearer that the list of tuples is
still sorted (and can be binary searched).

> > +	for (i = 0; i < writer.selected_nr; i++)
> > +		table[i] = i;
> > +
> > +	QSORT_S(table, writer.selected_nr, table_cmp, commit_positions);
>
> At the end of this sort, table[j] = i means that the ith bitmap corresponds
> to the jth bitmapped commit in lex order of OIDs.
>
> > +	for (i = 0; i < writer.selected_nr; i++)
> > +		table_inv[table[i]] = i;
>
> And table_inv helps us discover that relationship (ith bitmap to jth commit
> by j = table_inv[i]).

These are both great descriptions and should give an idea of what sort
of information is worth putting into a comment.
>
> > +	for (i = 0; i < writer.selected_nr; i++) {
> > +		struct bitmapped_commit *selected = &writer.selected[table[i]];
> > +		uint32_t xor_offset = selected->xor_offset;
>
> Here, xor_offset is "number of bitmaps in relationship to the current bitmap"

It's an offset to an earlier commit which must be used to XOR-decompress the
current one (if any).

> > +		hashwrite_be32(f, commit_positions[table[i]]);
> > +		hashwrite_be64(f, offsets[table[i]]);
> > +		hashwrite_be32(f, xor_offset ?
> > +				table_inv[table[i] - xor_offset]: 0xffffffff);
>
> Which means that if "k = table[i] - xor_offset" that the xor base is the kth
> bitmap. table_inv[k] gets us the position in this table of that bitmap's
> commit.

Yes, exactly. Abhradeep: this is also worth commenting ;-).

> (It's also strange to me that the offset is being _subtracted_, but I guess
> the bitmap format requires the xor base to appear first so the offset does
> not need to be a negative number ever.)

You're right, this follows from the fact that the XOR bases must come
before the commits who must use them to decompress themselves. From
Documentation/technical/bitmap-format.txt:

    This number is always positive, and hence entries are always xor'ed
    with **previous** bitmaps, not bitmaps that will come afterwards in
    the index.

> This last line is a bit complex.
>
> 	uint32_t xor_offset = selected->xor_offset;
> 	uint32_t xor_row = 0xffffffff;
>
> 	if (xor_offset) {
> 		uint32_t xor_order = table[i] - xor_offset;
> 		xor_row = table_inf[xor_order];
> 	}
>
> ...then we can "hashwrite_be32(f, xor_row);" when necessary. I'm not sure
> that we need the "uint32_t xor_order" inside the "if (xor_offset)" block,
> but splitting it helps add clarity to the multi-step computation.

I had the same thought, though I would also say that xor_row should be
declared, not initialized, and the "else" block of "if (xor_offset)"
should set it to 0xffffffff to make the relationship between xor_offset
and the value written a little clearer.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux