When we map in the .idx file, we do only minimum sanity checks to make sure that the .idx file itself has sorted fan-out. We do not check if the object names are sorted, so a bogus .idx could tell us that an object does not exist when it exists in the matching .pack, but that is harmless. Also an unsorted object names will not make our binary search run in circles while looking things up. We do not check if the offset of individual objects are within the corresponding .pack file, either, and nth_packed_object_offset() does return the data read from .idx file that is not checked for sanity. use_pack(), which is the helper used by the callers of nth_packed_object_offset() that finds the offset in the packfile for each object, avoids allowing a read access to mapped pack data beyond the end of it, so it is OK to return bogus value that was read from the .idx file from this function, but there is one computation the function itself does using a possibly bogus value read from the disk: to find out where in the secondary offset table in the .idx file the offset in the packfile is stored. --- This is not even compile tested; I just wanted to prevent people from adding two unnecessary checks to this function following my analysis in the previous message. I think returning bogus value stored in a crafted .idx file from this function is OK, as the offset will be first used by use_pack() and the sanity of the offset, relative to the packfile size, is checked there, and an offset that points to a random point in the packfile will be caught by the pack reading code, either by unpack_compressed_entry() or by patch_delta(), so that is also safe. We do need to check the unprotected access here. Nobody else in the current codepath protects us from this access attempting to read an unmapped memory and segfault. sha1_file.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/sha1_file.c b/sha1_file.c index 73ccd49..8aca1f6 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2458,6 +2458,13 @@ off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n) off = ntohl(*((uint32_t *)(index + 4 * n))); if (!(off & 0x80000000)) return off; + + /* 8-byte offset table */ + if ((p->index_size - (8 + 256 * 4 + 28 * p->num_objects + 40)) + < + (off & 0x7fffffff) * 8) + die("offset beyond end of .idx file"); + index += p->num_objects * 4 + (off & 0x7fffffff) * 8; return (((uint64_t)ntohl(*((uint32_t *)(index + 0)))) << 32) | ntohl(*((uint32_t *)(index + 4))); -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html