On 2021-05-14 at 14:36:00, Matheus Tavares wrote: > An object_id storing a SHA-1 name has some unused bytes at the end of > the hash array. Since these bytes are not used, they are usually not > initialized to any value either. However, at > parallel_checkout.c:send_one_item() the object_id of a cache entry is > copied into a buffer which is later sent to a checkout worker through a > pipe write(). This makes Valgrind complain about passing uninitialized > bytes to a syscall. The worker won't use these uninitialized bytes > either, but the warning could confuse someone trying to debug this code; > So instead of using oidcpy(), send_one_item() uses hashcpy() to only > copy the used/initialized bytes of the object_id, and leave the > remaining part with zeros. > > However, since cf0983213c ("hash: add an algo member to struct > object_id", 2021-04-26), using hashcpy() is no longer sufficient here as > it won't copy the new algo field from the object_id. Let's add and use a > new function which meets both our requirements of copying all the > important object_id data while still avoiding the uninitialized bytes, > by padding the end of the hash array in the destination object_id. With > this change, we also no longer need the destination buffer from > send_one_item() to be initialized with zeros, so let's switch from > xcalloc() to xmalloc() to make this clear. > > Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx> > --- > > Hi, brian > > I've read the hash transition plan, but I'm not confident to say that I > fully understand it yet, so maybe this patch is not exactly what we need > here. Mainly, I'm not sure I understand in which cases we will have an > object_id.algo that is not the_hash_algo. Is it for the early transition > phases, where we have a SHA-256 repo that accepts user input as SHA-1? Yes, that's correct, as well as for interoperability with remotes using a different hash algorithm. > Also, the object_id's copied here at send_one_item() always come from a > `struct cache_entry`. In this case, can they still have different > `algo`s or do we expect them to be the_hash_algo? No, things in the index should always use the same algorithm.. The patch looks fine to me. -- brian m. carlson (he/him or they/them) Houston, Texas, US
Attachment:
signature.asc
Description: PGP signature