Re: [PATCH 3/9] Convert unpack-objects to size_t

Martin Koegler <martin.koegler@xxxxxxxxx> · Sun, 13 Aug 2017 20:25:52 +0200

On Sat, Aug 12, 2017 at 04:07:50PM +0200, Martin Ågren wrote:
> On 12 August 2017 at 10:47, Martin Koegler <martin.koegler@xxxxxxxxx> wrote:
> "size" is handed over to a "git_zstream" and goes through zlib.c,
> eventually ending up in zlib, which is outside Git's control, and which
> seems to work with "uLong"s. How do these kind of changes interact with
> zlib? For example, I wonder about this line further down in get_data:
> 
> if (stream.total_out == size && ret == Z_STREAM_END)
> 
> If total_out isn't converted, I guess this would never hit if "size" is
> too large. And if total_out /is/ converted, I guess we'd risk truncation

I posted a patch changing git_zstream.

> in zlib_pre_call in zlib.c. Maybe that might cause Git and zlib to have
> different ideas about how much data is available and/or should be
> processed. Maybe we could then hit things like this in git.c:
> 
> if (s->z.total_out != s->total_out + bytes_produced)
>         die("BUG: total_out mismatch");
> 
> I am not very familiar with zlib, so apologies if this is just noise...

You are right, if sizeof(size_t) != sizeof(unsigned long), there can be truncations.

Currently, an object size is read/passed as ulong including to the memory allocation functions.
(x)malloc & Co take a length - so the whole GIT code might assume a larger object size than the
memory allocation functions.
Migrating everything to size_t means, that we move the truncation locations to places, where values 
enter/leave GIT.

My patches are just a starting point to fix the size handling. They concentrate of fixing data types - 
not avoiding any possible overflow.  Merging them will already be a challenging task, because they 
touch many functions and will likely conflict with other changes (eg. moving functions).

Regards,
Martin