Re: [PATCH 3/7] builtin/bugreport.c: avoid size_t overflow

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Wed, 03 Aug 2022 14:25:23 +0200

On Tue, Aug 02 2022, Victoria Dye wrote:

> Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Mon, Aug 01 2022, Victoria Dye via GitGitGadget wrote:
>> 
>>> From: Victoria Dye <vdye@xxxxxxxxxx>
>>>
>>> Avoid size_t overflow when reporting the available disk space in
>>> 'get_disk_info' by casting the block size and available block count to
>>> 'uint64_t' before multiplying them. Without this change, 'st_mult' would
>>> (correctly) report size_t overflow on 32-bit systems at or exceeding 2^32
>>> bytes of available space.
>>>
>>> Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx>
>>> ---
>>>  builtin/bugreport.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/builtin/bugreport.c b/builtin/bugreport.c
>>> index 35b1fc48bf1..720889a37ad 100644
>>> --- a/builtin/bugreport.c
>>> +++ b/builtin/bugreport.c
>>> @@ -258,7 +258,7 @@ static int get_disk_info(struct strbuf *out)
>>>  	}
>>>  
>>>  	strbuf_addf(out, "Available space on '%s': ", buf.buf);
>>> -	strbuf_humanise_bytes(out, st_mult(stat.f_bsize, stat.f_bavail));
>>> +	strbuf_humanise_bytes(out, (uint64_t)stat.f_bsize * (uint64_t)stat.f_bavail);
>> 
>> Doesn't this remove the overflow guard on 64 bit systems to support
>> those 32 bit systems?
>> 
>
> It does, but the total disk space available on a system should be able to
> fit into a 64-bit integer. I considered adding an explicit
> 'unsigned_mult_overflows', but decided against it because it's almost
> certainly overkill for such an implausible edge case.

Yeah it's probably overkill, and maybe this is good as-is & we don't
need to worry here.

But that's quite different from what the patch says, it's not "avoid
size_t overflow" but e.g.:

	bugreport.c: don't do size_t overflow check before casting to 32bit

	It's a hassle to support the check on 32 bit systems, and we
	don't think this is something we'll run into in practice [...]

Perhaps?

>> I also don't tthink it's correct that this would "correctly
>> report...". Before this we were simply assuming that "size_t" and
>> "unsigned long" & "fsblkcnt_t" would all yield the same thing.
>> 
>
> The point I was making is that, if your 'size_t' is 32 bits, but you have
> more than ~4GB of disk space available on your system, the result of the
> multiplication will overflow 'size_t'. So, 'st_mult' failing because it
> detects an overflow is "correct", rather than e.g. a false positive.

I think it would be useful to document these assumptions in the commit
message, POSIX just says "blkcnt_t and off_t shall be signed integer
types", and "size_t shall be an unsigned integer type.".

Do other bits of the standard(s) that I've missed say that off_t's
signed type must be double the width of size_t's unsigned, or is it one
of those things that's not standardized but can be relied on in
practice?

We have a related assertion in 37ee680d9b9 (http.postbuffer: allow full
range of ssize_t values, 2017-04-11) (xcurl_off_t()). Perhaps you want
to do something similar to sanity check your assumptions here?

1. https://pubs.opengroup.org/onlinepubs/009604599/basedefs/sys/types.h.html