Re: [PATCH 3/6] pack-bitmap-write.c: write lookup table extension

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 20, 2022 at 12:33:11PM +0000, Abhradeep Chakraborty via GitGitGadget wrote:
> From: Abhradeep Chakraborty <chakrabortyabhradeep79@xxxxxxxxx>
>
> Teach git to write bitmap lookup table extension. The table has the
> following information:
>
>     - `N` no of Object ids of each bitmapped commits

s/no/number, s/Object/object, s/ids/IDs, and s/commits/commit

>     - A list of offset, xor-offset pair; the i'th pair denotes the
>       offsets and xor-offsets of i'th commit in the previous list.

s/pair/pairs

>     - 4-byte integer denoting the flags
>
> Co-authored-by: Taylor Blau <ttaylorr@xxxxxxxxxx>
> Mentored-by: Taylor Blau <ttaylorr@xxxxxxxxxx>
> Co-mentored-by: Kaartic Sivaraam <kaartic.sivaraam@xxxxxxxxx>
> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@xxxxxxxxx>
> ---
>  pack-bitmap-write.c | 59 +++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> index c43375bd344..9e88a64dd65 100644
> --- a/pack-bitmap-write.c
> +++ b/pack-bitmap-write.c
> @@ -650,7 +650,8 @@ static const struct object_id *oid_access(size_t pos, const void *table)
>
>  static void write_selected_commits_v1(struct hashfile *f,
>  				      struct pack_idx_entry **index,
> -				      uint32_t index_nr)
> +				      uint32_t index_nr,
> +				      off_t *offsets)
>  {
>  	int i;
>
> @@ -663,6 +664,9 @@ static void write_selected_commits_v1(struct hashfile *f,
>  		if (commit_pos < 0)
>  			BUG("trying to write commit not in index");
>
> +		if (offsets)
> +			offsets[i] = hashfile_total(f);
> +

Makes sense; we record the offset for the ith commit as however many
bytes we've already written into the hashfile up to this point, since
the subsequent byte will begin the bitmap (well, the preceding few
bytes of it, anyways) itself.

>  		hashwrite_be32(f, commit_pos);
>  		hashwrite_u8(f, stored->xor_offset);
>  		hashwrite_u8(f, stored->flags);
> @@ -671,6 +675,49 @@ static void write_selected_commits_v1(struct hashfile *f,
>  	}
>  }
>
> +static int table_cmp(const void *_va, const void *_vb)
> +{
> +	return oidcmp(&writer.selected[*(uint32_t*)_va].commit->object.oid,
> +		      &writer.selected[*(uint32_t*)_vb].commit->object.oid);

This implementation looks right to me, but perhaps we should expand it
out from the one-liner here to make it more readable. Perhaps something
like:

    static int table_cmp(const void *_va, const void *_vb)
    {
      struct commit *c1 = &writer.selected[*(uint32_t*)_va];
      struct commit *c2 = &writer.selected[*(uint32_t*)_vb];

      return oidcmp(&c1->object.oid, &c2->object.oid);
    }

which is arguably slightly more readable than the one-liner (but I don't
feel that strongly about it.)

> +static void write_lookup_table(struct hashfile *f,
> +			       off_t *offsets)
> +{
> +	uint32_t i;
> +	uint32_t flags = 0;
> +	uint32_t *table, *table_inv;
> +
> +	ALLOC_ARRAY(table, writer.selected_nr);
> +	ALLOC_ARRAY(table_inv, writer.selected_nr);
> +
> +	for (i = 0; i < writer.selected_nr; i++)
> +		table[i] = i;
> +	QSORT(table, writer.selected_nr, table_cmp);
> +	for (i = 0; i < writer.selected_nr; i++)
> +		table_inv[table[i]] = i;

Right... so table[0] will give us the index into writer.selected of the
commit with the earliest OID in lexicographic order. And table_inv goes
the other way around: table_inv[i] will tell us the lexicographic
position of the commit at writer.selected[i].

> +	for (i = 0; i < writer.selected_nr; i++) {
> +		struct bitmapped_commit *selected = &writer.selected[table[i]];
> +		struct object_id *oid = &selected->commit->object.oid;
> +
> +		hashwrite(f, oid->hash, the_hash_algo->rawsz);
> +	}
> +	for (i = 0; i < writer.selected_nr; i++) {
> +		struct bitmapped_commit *selected = &writer.selected[table[i]];
> +
> +		hashwrite_be32(f, offsets[table[i]]);
> +		hashwrite_be32(f, selected->xor_offset
> +			       ? table_inv[table[i] - selected->xor_offset]

...which we need to discover the position of the XOR'd bitmap. Though
I'm not sure if I remember why `table[i] - selected->xor_offset` is
right and not `i - selected->xor_offset`.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux