On Sat, Aug 12, 2017 at 04:07:50PM +0200, Martin Ågren wrote: > On 12 August 2017 at 10:47, Martin Koegler <martin.koegler@xxxxxxxxx> wrote: > "size" is handed over to a "git_zstream" and goes through zlib.c, > eventually ending up in zlib, which is outside Git's control, and which > seems to work with "uLong"s. How do these kind of changes interact with > zlib? For example, I wonder about this line further down in get_data: > > if (stream.total_out == size && ret == Z_STREAM_END) > > If total_out isn't converted, I guess this would never hit if "size" is > too large. And if total_out /is/ converted, I guess we'd risk truncation I posted a patch changing git_zstream. > in zlib_pre_call in zlib.c. Maybe that might cause Git and zlib to have > different ideas about how much data is available and/or should be > processed. Maybe we could then hit things like this in git.c: > > if (s->z.total_out != s->total_out + bytes_produced) > die("BUG: total_out mismatch"); > > I am not very familiar with zlib, so apologies if this is just noise... You are right, if sizeof(size_t) != sizeof(unsigned long), there can be truncations. Currently, an object size is read/passed as ulong including to the memory allocation functions. (x)malloc & Co take a length - so the whole GIT code might assume a larger object size than the memory allocation functions. Migrating everything to size_t means, that we move the truncation locations to places, where values enter/leave GIT. My patches are just a starting point to fix the size handling. They concentrate of fixing data types - not avoiding any possible overflow. Merging them will already be a challenging task, because they touch many functions and will likely conflict with other changes (eg. moving functions). Regards, Martin