Re: [RFC PATCH] parallel-checkout: send the new object_id algo field to the workers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2021-05-14 at 14:36:00, Matheus Tavares wrote:
> An object_id storing a SHA-1 name has some unused bytes at the end of
> the hash array. Since these bytes are not used, they are usually not
> initialized to any value either. However, at
> parallel_checkout.c:send_one_item() the object_id of a cache entry is
> copied into a buffer which is later sent to a checkout worker through a
> pipe write(). This makes Valgrind complain about passing uninitialized
> bytes to a syscall. The worker won't use these uninitialized bytes
> either, but the warning could confuse someone trying to debug this code;
> So instead of using oidcpy(), send_one_item() uses hashcpy() to only
> copy the used/initialized bytes of the object_id, and leave the
> remaining part with zeros.
> 
> However, since cf0983213c ("hash: add an algo member to struct
> object_id", 2021-04-26), using hashcpy() is no longer sufficient here as
> it won't copy the new algo field from the object_id. Let's add and use a
> new function which meets both our requirements of copying all the
> important object_id data while still avoiding the uninitialized bytes,
> by padding the end of the hash array in the destination object_id. With
> this change, we also no longer need the destination buffer from
> send_one_item() to be initialized with zeros, so let's switch from
> xcalloc() to xmalloc() to make this clear.
> 
> Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx>
> ---
> 
> Hi, brian
> 
> I've read the hash transition plan, but I'm not confident to say that I
> fully understand it yet, so maybe this patch is not exactly what we need
> here. Mainly, I'm not sure I understand in which cases we will have an
> object_id.algo that is not the_hash_algo. Is it for the early transition
> phases, where we have a SHA-256 repo that accepts user input as SHA-1? 

Yes, that's correct, as well as for interoperability with remotes using
a different hash algorithm.

> Also, the object_id's copied here at send_one_item() always come from a
> `struct cache_entry`. In this case, can they still have different
> `algo`s or do we expect them to be the_hash_algo?

No, things in the index should always use the same algorithm..

The patch looks fine to me.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux