When we're checking a pack with fsck or verify-pack, we first sort the idx entries by offset, since accessing them in pack order is more efficient. To do so, we loop over them and fill in an array of structs with the offset, object_id, and index position of each, sort the result, and only then do we iterate over the sorted array and process each entry. In order to avoid the memory cost of storing the hash of each object, we just store a pointer into the copy in the mmap'd pack index file. To keep that property even as the rest of the code converted to "struct object_id", commit 9fd750461b (Convert the verify_pack callback to struct object_id, 2017-05-06) introduced a union in order to type-pun the pointer-to-hash into an object_id struct. But we can make this even simpler by observing that the sort operation doesn't need the object id at all! We only need them one at a time while we actually process each entry. So we can just omit the oid from the struct entirely and load it on the fly into a local variable in the second loop. This gets rid of the type-punning, and lets us directly use the more type-safe nth_packed_object_id(), simplifying the code. And as a bonus, it saves 8 bytes of memory per object. Note that this does mean we'll do the offset lookup for each object before the oid lookup. The oid lookup has more safety checks in it (e.g., for looking past p->num_objects) which in theory protected the offset lookup. But since violating those checks was already a BUG() condition (as described in the previous commit), it's not worth worrying about. Signed-off-by: Jeff King <peff@xxxxxxxx> --- pack-check.c | 23 ++++++++++------------- 1 file changed, 10 insertions(+), 13 deletions(-) diff --git a/pack-check.c b/pack-check.c index 39196ecfbc..dad6d8ae7f 100644 --- a/pack-check.c +++ b/pack-check.c @@ -8,10 +8,6 @@ struct idx_entry { off_t offset; - union idx_entry_object { - const unsigned char *hash; - struct object_id *oid; - } oid; unsigned int nr; }; @@ -97,30 +93,31 @@ static int verify_packfile(struct repository *r, entries[nr_objects].offset = pack_sig_ofs; /* first sort entries by pack offset, since unpacking them is more efficient that way */ for (i = 0; i < nr_objects; i++) { - entries[i].oid.hash = nth_packed_object_sha1(p, i); - if (!entries[i].oid.hash) - BUG("unable to get oid of object %lu from %s", - (unsigned long)i, p->pack_name); entries[i].offset = nth_packed_object_offset(p, i); entries[i].nr = i; } QSORT(entries, nr_objects, compare_entries); for (i = 0; i < nr_objects; i++) { void *data; + struct object_id oid; enum object_type type; unsigned long size; off_t curpos; int data_valid; + if (nth_packed_object_id(&oid, p, entries[i].nr) < 0) + BUG("unable to get oid of object %lu from %s", + (unsigned long)entries[i].nr, p->pack_name); + if (p->index_version > 1) { off_t offset = entries[i].offset; off_t len = entries[i+1].offset - offset; unsigned int nr = entries[i].nr; if (check_pack_crc(p, w_curs, offset, len, nr)) err = error("index CRC mismatch for object %s " "from %s at offset %"PRIuMAX"", - oid_to_hex(entries[i].oid.oid), + oid_to_hex(&oid), p->pack_name, (uintmax_t)offset); } @@ -143,14 +140,14 @@ static int verify_packfile(struct repository *r, if (data_valid && !data) err = error("cannot unpack %s from %s at offset %"PRIuMAX"", - oid_to_hex(entries[i].oid.oid), p->pack_name, + oid_to_hex(&oid), p->pack_name, (uintmax_t)entries[i].offset); - else if (check_object_signature(r, entries[i].oid.oid, data, size, type_name(type))) + else if (check_object_signature(r, &oid, data, size, type_name(type))) err = error("packed %s from %s is corrupt", - oid_to_hex(entries[i].oid.oid), p->pack_name); + oid_to_hex(&oid), p->pack_name); else if (fn) { int eaten = 0; - err |= fn(entries[i].oid.oid, type, size, data, &eaten); + err |= fn(&oid, type, size, data, &eaten); if (eaten) data = NULL; } -- 2.25.1.823.g95c5488cf7