On Wed, 11 Nov 2020 at 20:43, Taylor Blau <me@xxxxxxxxxxxx> wrote: > > A .bitmap file may have a "name hash cache" extension, which puts a > sequence of uint32_t bytes (one per object) at the end of the file. When s/bytes/values/, perhaps? > we see a flag indicating this extension, we blindly subtract the > appropriate number of bytes from our available length. However, if the > .bitmap file is too short, we'll underflow our length variable and wrap > around, thinking we have a very large length. This can lead to reading > out-of-bounds bytes while loading individual ewah bitmaps. > + uint32_t cache_size = st_mult(index->pack->num_objects, sizeof(uint32_t)); Hmm. If `sizeof(size_t)` is 8, then this multiplication can't possibly overflow. A huge value of `num_objects` (say, 0xffffffff) would give a huge return value (0xffffffff<<2) which would be truncated (0xfffffffc). I think? Do we want a `u32_mult()`? > + unsigned char *index_end = index->map + index->map_size - the_hash_algo->rawsz; The addition should be ok or mmap has failed on us. Do we know that we have room for the final hash there so that the subtraction is ok? Yes, from the previous commit, we know we have room for the header, which is even larger. But that's cheating a bit -- see below. > + if (index->map + header_size + cache_size > index_end) > + return error("corrupted bitmap index file (too short to fit hash cache)"); > + index->hashes = (void *)(index_end - cache_size); > + index_end -= cache_size; If the header size we're adding is indeed too large, the addition in the check would be undefined behavior, if I'm not mistaken. In practical terms, with 32-bit pointers and a huge size, we'd wrap around, decide that everything is ok and go on to do the same erroneous subtraction as before. Maybe shuffle a few things over from the left to the right to only make subtractions that we know are ok: if (cache_size > index_end - index->map - header_size) One could substitute for `index_end - index_map` and end up with if (cache_size > index->map_size - header_size - the_hash_algo->rawsz) Maybe that's clearer in a way, or maybe then it's not so obvious that the subtraction that follows matches this check. But I don't think we can fully trust those subtractions. We're subtracting the size of two hashes (one in the header, one in the footer), but after the previous patch, we only know that there's room for one. So probably the previous patch could go + /* + * Verify that we have room for the header and the + * trailing checksum hash, so we can safely subtract + * their sizes from map_size. We can afford to be + * a bit imprecise with the error message. + */ - if (index->map_size < sizeof(*header) + the_hash_algo->rawsz) + if (index->map_size < header_size + the_hash_algo->rawsz) I *think* I've got most of my comments here somewhat right, but I could easily have missed something. Martin