Re: [PATCH 10/20] midx: bounds-check large offset chunk

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 09, 2023 at 05:05:30PM -0400, Jeff King wrote:
> When we see a large offset bit in the regular midx offset table, we
> use the entry as an index into a separate large offset table (just like
> a pack idx does). But we don't bounds-check the access to that large
> offset table (nor even record its size when we parse the chunk!).
>
> The equivalent code for a regular pack idx is in check_pack_index_ptr().
> But things are a bit simpler here because of the chunked format: we can
> just check our array index directly.
>
> As a bonus, we can get rid of the st_mult() here. If our array
> bounds-check is successful, then we know that the result will fit in a
> size_t (and the bounds check uses a division to avoid overflow
> entirely).
>
> Signed-off-by: Jeff King <peff@xxxxxxxx>
> ---
>  midx.c                      |  8 +++++---
>  midx.h                      |  1 +
>  t/t5319-multi-pack-index.sh | 20 ++++++++++++++++++++
>  3 files changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/midx.c b/midx.c
> index 7b1b45f381..3e768d0df0 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -180,7 +180,8 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local
>  	if (read_chunk(cf, MIDX_CHUNKID_OBJECTOFFSETS, midx_read_object_offsets, m))
>  		die(_("multi-pack-index required object offsets chunk missing or corrupted"));
>
> -	pair_chunk_unsafe(cf, MIDX_CHUNKID_LARGEOFFSETS, &m->chunk_large_offsets);
> +	pair_chunk(cf, MIDX_CHUNKID_LARGEOFFSETS, &m->chunk_large_offsets,
> +		   &m->chunk_large_offsets_len);
>
>  	if (git_env_bool("GIT_TEST_MIDX_READ_RIDX", 1))
>  		pair_chunk_unsafe(cf, MIDX_CHUNKID_REVINDEX, &m->chunk_revindex);
> @@ -303,8 +304,9 @@ off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos)
>  			die(_("multi-pack-index stores a 64-bit offset, but off_t is too small"));
>
>  		offset32 ^= MIDX_LARGE_OFFSET_NEEDED;
> -		return get_be64(m->chunk_large_offsets +
> -				st_mult(sizeof(uint64_t), offset32));
> +		if (offset32 >= m->chunk_large_offsets_len / sizeof(uint64_t))
> +			die(_("multi-pack-index large offset out of bounds"));
> +		return get_be64(m->chunk_large_offsets + sizeof(uint64_t) * offset32);

Makes sense, and this seems very reasonable. I had a couple of thoughts
on this hunk, one nitpick, and one non-nitpick ;-).

The nitpick is the easy one, which is that I typically think of scaling
some index to produce an offset into some chunk, instead of dividing and
going the other way.

So I probably would have written something like:

    if (st_mult(offset32, sizeof(uint64_t)) >= m->chunk_large_offsets_len)
        die(_("multi-pack-index large offset out of bounds"));

But that is definitely a subjective/minor point, and I would not at all
be sad if you felt differently about it. That said, I do wish for a
little more information in the die() message, perhaps:

    if (st_mult(offset32, sizeof(uint64_t)) >= m->chunk_large_offsets_len)
        die(_("multi-pack-index large offset for %s out of bounds "
              "(%"PRIuMAX" is beyond %"PRIuMAX")"),
            (uintmax_t)(offset32 * sizeof(uint64_t)), /* checked earlier */
            (uintmax_t)m->chunk_large_offsets_len);

I can imagine that for debugging corrupt MIDXs in the future, having
some extra information like the above would give us a better starting
point than popping into a debugger to get the same information.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux