Re: [PATCH v2 01/13] hash: add an algo member to struct object_id

Matheus Tavares Bernardino <matheus.bernardino@xxxxxx> · Fri, 7 May 2021 10:58:42 -0300

Hi, brian

On Sun, Apr 25, 2021 at 10:03 PM brian m. carlson
<sandals@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Now that we're working with multiple hash algorithms in the same repo,
> it's best if we label each object ID with its algorithm so we can
> determine how to format a given object ID. Add a member called algo to
> struct object_id.

In parallel-checkout.c:send_one_item(), I used hashcpy() instead of
oidcpy() to prepare the packet data that is sent to the checkout
workers through a pipe.

I avoided oidcpy() because it would copy the whole GIT_MAX_RAWSZ
bytes, and the last part could be uninitialized, leading to a Valgrind
warning about passing uninitialized bytes to a write() syscall. There
is no real harm in this case, but I wanted to avoid the warning as it
might confuse someone trying to debug this code, me included.

The problem with this approach, of course, is that it will not copy
the new `algo` field, leaving it as zero for all items. So, what do
you think would be best in this situation? Some ideas that came
through my mind were:

1. Make oidcpy() only copy `hash_algos[src->algo].rawsz` bytes. (But
then we would probably need to branch in case `src->algo` is zero,
right?)

2. Reintroduce the oid_pad_buffer() function from your v1, and use it
in parallel-checkout.c:send_one_item(), after oidcpy(). This would
then zero out the copied uninitialized bytes (with the cost of one
additional memcpy() per item, but this might be neglectable here).

3. Use oidcpy() as-is, without additional padding, and let Valgrind
warn. This false-positive warn might not be so problematic after all,
and maybe I'm just overthinking things :)

What do you think?

Thanks,
Matheus