On 1/14/2021 2:22 AM, Junio C Hamano wrote: > Taylor Blau <me@xxxxxxxxxxxx> writes: > >> +== pack-*.rev files have the format: >> + >> + - A 4-byte magic number '0x52494458' ('RIDX'). >> + >> + - A 4-byte version identifier (= 1) >> + >> + - A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256) > > These two are presumably 4-byte-wide network byte order integers. > We should spell it out. In the past, we've included the sentence "All multi-byte numbers are in network byte order." to clarify this. These two entries need to specify that the "identifier" is actually an integer. Perhaps these three values could be provided as: +== pack-*.rev files have the format: + + - A 4-byte identifier '0x52494458' ('RIDX'). + + - A 4-byte integer version (= 1) + + - A 4-byte integer hash function version (= 1 for SHA-1, 2 for SHA-256) >> /* >> * load_pack_revindex populates the revindex's internal data-structures for the >> * given pack, returning zero on success and a negative value otherwise. >> + * >> + * If a '.rev' file is present, it is checked for consistency, mmap'd, and >> + * pointers are assigned into it (instead of using the in-memory variant). > > Hmph, I missed where it got checked for consistency, though. If the > file is corrupt and has say duplicated entries, we'd happily grab > the data via get_be32(), for example. Even if the consistency check is just verifying the trailing hash, that seems like something that requires O(N) before performing a lookup. Perhaps this was copied from somewhere else, or means something different? Thanks, -Stolee