Re: [PATCH v5 2/6] pack-bitmap-write.c: write lookup table extension

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 20, 2022 at 06:38:20PM +0000, Abhradeep Chakraborty via GitGitGadget wrote:
> From: Abhradeep Chakraborty <chakrabortyabhradeep79@xxxxxxxxx>
>
> The bitmap lookup table extension was documented by an earlier
> change, but Git does not yet know how to write that extension.
>
> Teach Git to write bitmap lookup table extension. The table contains
> the list of `N` <commit_pos, offset, xor_row>` triplets. These
> triplets are sorted according to their commit pos (ascending order).
> The meaning of each data in the i'th triplet is given below:
>
>   - commit_pos stores commit position (in the pack-index or midx).
>     It is a 4 byte network byte order unsigned integer.
>
>   - offset is the position (in the bitmap file) from which that
>     commit's bitmap can be read.
>
>   - xor_row is the position of the triplet in the lookup table
>     whose bitmap is used to compress this bitmap, or `0xffffffff`
>     if no such bitmap exists.
>
> Mentored-by: Taylor Blau <me@xxxxxxxxxxxx>
> Co-mentored-by: Kaartic Sivaraam <kaartic.sivaraam@xxxxxxxxx>
> Co-authored-by: Taylor Blau <me@xxxxxxxxxxxx>
> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@xxxxxxxxx>
> ---
>  pack-bitmap-write.c | 112 ++++++++++++++++++++++++++++++++++++++++----
>  pack-bitmap.h       |   5 +-
>  2 files changed, 107 insertions(+), 10 deletions(-)
>
> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> index c43375bd344..9843790cb60 100644
> --- a/pack-bitmap-write.c
> +++ b/pack-bitmap-write.c
> @@ -650,20 +650,19 @@ static const struct object_id *oid_access(size_t pos, const void *table)
>
>  static void write_selected_commits_v1(struct hashfile *f,
>  				      struct pack_idx_entry **index,
> -				      uint32_t index_nr)
> +				      uint32_t index_nr,
> +				      off_t *offsets,
> +				      uint32_t *commit_positions)
>  {
>  	int i;
>
>  	for (i = 0; i < writer.selected_nr; ++i) {
>  		struct bitmapped_commit *stored = &writer.selected[i];
>
> -		int commit_pos =
> -			oid_pos(&stored->commit->object.oid, index, index_nr, oid_access);
> +		if (offsets)
> +			offsets[i] = hashfile_total(f);
>
> -		if (commit_pos < 0)
> -			BUG("trying to write commit not in index");
> -
> -		hashwrite_be32(f, commit_pos);
> +		hashwrite_be32(f, commit_positions[i]);

I wonder if it would make this patch a little more readable to construct
and use the commit_positions array as a single preparatory step before
this commit.

What do you think?

> +static void write_lookup_table(struct hashfile *f,
> +			       struct pack_idx_entry **index,
> +			       uint32_t index_nr,
> +			       off_t *offsets,
> +			       uint32_t *commit_positions)
> +{
> +	uint32_t i;
> +	uint32_t *table, *table_inv;
> +
> +	ALLOC_ARRAY(table, writer.selected_nr);
> +	ALLOC_ARRAY(table_inv, writer.selected_nr);
> +
> +	for (i = 0; i < writer.selected_nr; i++)
> +		table[i] = i;
> +
> +	/*
> +	 * At the end of this sort table[j] = i means that the i'th
> +	 * bitmap corresponds to j'th bitmapped commit (among the selected
> +	 * commits) in lex order of OIDs.
> +	 */
> +	QSORT_S(table, writer.selected_nr, table_cmp, commit_positions);
> +
> +	/* table_inv helps us discover that relationship (i'th bitmap
> +	 * to j'th commit by j = table_inv[i])
> +	 */
> +	for (i = 0; i < writer.selected_nr; i++)
> +		table_inv[table[i]] = i;
> +
> +	trace2_region_enter("pack-bitmap-write", "writing_lookup_table", the_repository);
> +	for (i = 0; i < writer.selected_nr; i++) {
> +		struct bitmapped_commit *selected = &writer.selected[table[i]];
> +		uint32_t xor_offset = selected->xor_offset;
> +		uint32_t xor_row;
> +
> +		if (xor_offset) {
> +			/*
> +			 * xor_index stores the index (in the bitmap entries)
> +			 * of the corresponding xor bitmap. But we need to convert
> +			 * this index into lookup table's index. So, table_inv[xor_index]
> +			 * gives us the index position w.r.t. the lookup table.
> +			 *
> +			 * If "k = table[i] - xor_offset" then the xor base is the k'th
> +			 * bitmap. `table_inv[k]` gives us the position of that bitmap
> +			 * in the lookup table.
> +			 */
> +			uint32_t xor_index = table[i] - xor_offset;
> +			xor_row = table_inv[xor_index];
> +		} else {
> +			xor_row = 0xffffffff;
> +		}
> +
> +		hashwrite_be32(f, commit_positions[table[i]]);
> +		hashwrite_be64(f, (uint64_t)offsets[table[i]]);
> +		hashwrite_be32(f, xor_row);
> +	}
> +	trace2_region_leave("pack-bitmap-write", "writing_lookup_table", the_repository);
> +
> +	free(table);
> +	free(table_inv);
> +}


> @@ -715,7 +791,25 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
>  	dump_bitmap(f, writer.trees);
>  	dump_bitmap(f, writer.blobs);
>  	dump_bitmap(f, writer.tags);
> -	write_selected_commits_v1(f, index, index_nr);
> +
> +	ALLOC_ARRAY(commit_positions, writer.selected_nr);
> +	for (uint32_t i = 0; i < writer.selected_nr; ++i) {

Nit; we don't typically write for-loop expressions with variable
declarations inside of them. Make sure to declare i outside of the loop,
and then this becomes:

    for (i = 0; i < writer.selected_nr; i++)

(also, we typically use the postfix ++ operator, that is "i++" instead
of "++i" unless there is a reason to prefer the latter over the former).

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux