Re: [PATCH v2 15/17] midx: use 64-bit multiplication for chunk sizes

SZEDER Gábor <szeder.dev@xxxxxxxxx> · Sun, 7 Feb 2021 20:50:53 +0100

On Thu, Feb 04, 2021 at 04:00:19PM -0800, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> 
> > From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> >
> > When calculating the sizes of certain chunks, we should use 64-bit
> > multiplication always. This allows us to properly predict the chunk
> > sizes without risk of overflow.
> >
> > Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> > ---
> >  midx.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> This one I find somewhat questionable for multiple reasons.
> 
>  * the fourth parameter of add_chunk() is of size_t, not uint64_t;
>    shouldn't the multiplication be done in type size_t instead?
> 
>  * these mutiplications were introduced in "midx: use chunk-format
>    API in write_midx_internal()";

No, that patch also removes lines like: 

-       chunk_offsets[cur_chunk] = chunk_offsets[cur_chunk - 1] + ctx.entries_nr * the_hash_algo->rawsz;

-               chunk_offsets[cur_chunk] = chunk_offsets[cur_chunk - 1] +
-                                          ctx.num_large_offsets * MIDX_CHUNK_LARGE_OFFSET_WIDTH;

So those potentially problematic multiplications were already there
before this series, and in fact trace all the way back to the initial
midx patch series (commits 0d5b3a5ef7 (midx: write object ids in a 
chunk, 2018-07-12) and 662148c435 (midx: write object offsets,
2018-07-12)).

>    that step should use the
>    arithmetic with cast (if necessary) from the start, no?

As it fixes a long-standing issue, it should rather be a bugfix patch
at the beginning of the series.