Re: [PATCH 25/44] packfile: compute and use the index CRC offset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 13 May 2020 at 02:56, brian m. carlson
<sandals@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Both v2 pack index files and the v3 format specified as part of the
> NewHash work have similar data starting at the CRC table.  Much of the
> existing code wants to read either this table or the offset entries
> following it, and in doing so computes the offset each time.
>
> In order to share as much code between v2 and v3, compute the offset of
> the CRC table and store it when the pack is opened.  Use this value to
> compute offsets to not only the CRC table, but to the offset entries
> beyond it.

> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -1555,13 +1555,9 @@ static void read_v2_anomalous_offsets(struct packed_git *p,
>  {
>         const uint32_t *idx1, *idx2;
>         uint32_t i;
> -       const uint32_t hashwords = the_hash_algo->rawsz / sizeof(uint32_t);
>
>         /* The address of the 4-byte offset table */
> -       idx1 = (((const uint32_t *)p->index_data)
> -               + 2 /* 8-byte header */
> -               + 256 /* fan out */
> -               + hashwords * p->num_objects /* object ID table */
> +       idx1 = (((const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset))
>                 + p->num_objects /* CRC32 table */
>                 );

This counts in four-byte words (so `+ 2` skips ahead 8B as the comment
notes). And that's why we need to use "rawsz/4".

Not new in this patch, but that outer pair of parenthesis just makes
this harder to read, IMHO. I keep scanning back and forth wondering,
"where is this whole thing going to get multiplied or something?"

  idx1 = (const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset)
         + p->num_objects /* CRC32 table */;

The double-casting can be avoided with something like this, but I'm not
sure it's really any better:

  idx1 = (const uint32_t *)p->index_data
         + p->crc_offset/sizeof(uint32_t)
         + p->num_objects /* CRC32 table */;

> --- a/packfile.c
> +++ b/packfile.c
> @@ -178,6 +178,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
>                      */
>                     (sizeof(off_t) <= 4))
>                         return error("pack too large for current definition of off_t in %s", path);
> +               p->crc_offset = 8 + 4 * 256 + nr * hashsz;
>         }
>
>         p->index_version = version;

It doesn't fit in the context, but `nr` will be assigned to
`p->num_objects`. And now we can just use `hashsz` without dividing by
4, so this does the same calculation as the old one above.


Martin



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux