When chasing a REF_DELTA, we need to pull the raw hash bytes out of the
mmap'd packfile into an object_id struct. We do that with a raw
hashcpy() of the appropriate length (that happens directly now, though
before the previous commit it happened inside find_pack_entry_one(),
also using a hashcpy).

But I think this creates a potentially dangerous situation due to
d4d364b2c7 (hash: convert `oidcmp()` and `oideq()` to compare whole
hash, 2024-06-14). When using sha1, we'll have uninitialized bytes in
the latter part of the object_id.hash buffer, which could fool oideq(),
etc.

We should use oidread() instead, which correctly zero-pads the extra
bytes, as of c98d762ed9 (global: ensure that object IDs are always
padded, 2024-06-14).

As far as I can see, this has not been a problem in practice because the
object_id we feed to find_pack_entry_one() is never used with oideq(),
etc. It is being compared to the bytes mmap'd from a pack idx file,
which of course do not have the extra padding bytes themselves. So
there's no bug here, but this just puzzled me while looking at the code.
We should do the more obviously safe thing, both for future-proofing and
to avoid confusing readers.

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
+cc Patrick for any wisdom here. I'd guess that the conversions from
c98d762ed9 were found by using ASan, valgrind, or similar with the new
oideq() implementation. And so this case, because it actually is safe,
was not flagged.
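To make the concern concrete, here is a small standalone sketch. It is
not Git's code: every toy_* name and constant is made up for
illustration, and the struct omits everything except the hash buffer.
It assumes only what the message above describes, namely a fixed-size
buffer large enough for the biggest hash, a comparison that looks at
the whole buffer, a raw copy that writes only the hash-sized prefix,
and a padded read that zeroes the tail. The "uninitialized" bytes are
simulated with fixed fill patterns so the behavior is deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the real sizes; not Git's definitions. */
#define TOY_SHA1_RAWSZ 20
#define TOY_MAX_RAWSZ  32

struct toy_oid {
	unsigned char hash[TOY_MAX_RAWSZ];
};

/* Raw copy: writes only the first rawsz bytes, leaving the tail alone. */
static void toy_hashcpy(unsigned char *dst, const unsigned char *src,
			size_t rawsz)
{
	assert(rawsz <= TOY_MAX_RAWSZ);
	memcpy(dst, src, rawsz);
}

/* Padded read: copies rawsz bytes, then zeroes the rest of the buffer. */
static void toy_oidread(struct toy_oid *oid, const unsigned char *src,
			size_t rawsz)
{
	assert(rawsz <= TOY_MAX_RAWSZ);
	memcpy(oid->hash, src, rawsz);
	memset(oid->hash + rawsz, 0, TOY_MAX_RAWSZ - rawsz);
}

/* Whole-buffer comparison, regardless of the hash length in use. */
static int toy_oideq(const struct toy_oid *a, const struct toy_oid *b)
{
	return !memcmp(a->hash, b->hash, TOY_MAX_RAWSZ);
}

/*
 * Fill two toy_oid structs from the same 20-byte hash and report
 * whether they compare equal. The differing memset() patterns stand
 * in for whatever stack garbage the structs would otherwise contain.
 */
static int toy_same_hash_compares_equal(int use_padded_read)
{
	unsigned char raw[TOY_SHA1_RAWSZ];
	struct toy_oid a, b;

	memset(raw, 0x42, sizeof(raw));
	memset(&a, 0xAA, sizeof(a));
	memset(&b, 0x55, sizeof(b));

	if (use_padded_read) {
		toy_oidread(&a, raw, sizeof(raw));
		toy_oidread(&b, raw, sizeof(raw));
	} else {
		toy_hashcpy(a.hash, raw, sizeof(raw));
		toy_hashcpy(b.hash, raw, sizeof(raw));
	}
	return toy_oideq(&a, &b);
}
```

In this sketch, toy_same_hash_compares_equal(0) returns 0 even though
both structs hold the same 20 hash bytes, because the trailing 12 bytes
differ; toy_same_hash_compares_equal(1) returns 1. A comparison limited
to the first 20 bytes, like the pack idx lookup described above, would
not notice the difference either way.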
 packfile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/packfile.c b/packfile.c
index 005ca670b4..9560f0a33c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1240,7 +1240,7 @@ off_t get_delta_base(struct packed_git *p,
 	} else if (type == OBJ_REF_DELTA) {
 		/* The base entry _must_ be in the same pack */
 		struct object_id oid;
-		hashcpy(oid.hash, base_info, the_repository->hash_algo);
+		oidread(&oid, base_info, the_repository->hash_algo);
 		base_offset = find_pack_entry_one(&oid, p);
 		*curpos += the_hash_algo->rawsz;
 	} else
-- 
2.47.0.363.g6e72b256be