On Fri, Mar 24, 2023 at 12:43:46PM -0700, Junio C Hamano wrote: > > But given that there is only one spot that calls these, that kind of > > refactoring might not be worth it (right now it just uses the magic > > number "6" right before grabbing the data). > > Yeah, it seems most of the callers with SEEK_SET are "I find the > next offset from a table and jump there in preparation for doing > something". I suspect callers with SEEK_CUR would fit in the > read_X() pattern better? From that angle, it smells that the two > kinds of seek functions may want to be split into two different > helpers. Yes, I think the SEEK_SET cases really do need to be doing more checking. AFAICT they are blindly trusting the offsets in the file (which is locally generated, so it's more of a corruption problem than a security one, but still). And this series improves that, which is good (but I still think it should be a die() and not a BUG()). The SEEK_CUR cases in theory could all look like the nice read_be32() I showed earlier, but I think in practice there are a lot of variants (skipping read of index_pos, advancing past size given by ewah_read_mmap(), and so on). And the current code, while ugly, does give more specific error messages (e.g., telling on _which_ commit we found the truncated data). So I dunno. Certainly there could be more consistency in the magic numbers. E.g., in this code: if (bitmap_git->map_size - bitmap_git->map_pos < bitmap_header_size) { error(_("corrupt ewah bitmap: truncated header for bitmap of commit \"%s\""), oid_to_hex(&xor_item->oid)); goto corrupt; } bitmap_git->map_pos += sizeof(uint32_t) + sizeof(uint8_t); xor_flags = read_u8(bitmap_git->map, &bitmap_git->map_pos); bitmap = read_bitmap_1(bitmap_git); There is an assumption that sizeof(uint32_t) + sizeof(uint8_t) is equal to bitmap_header_size - 1. That's not wrong, but it's hard to verify that it's doing the right thing, and it's potentially fragile to changes (though such changes seem unlikely). -Peff