Re: [PATCH] zlib.c: use size_t for size

On Fri, Oct 12, 2018 at 04:07:25PM +0900, Junio C Hamano wrote:

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index e6316d294d..b9ca04eb8a 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -266,15 +266,15 @@ static void copy_pack_data(struct hashfile *f,
>  		struct packed_git *p,
>  		struct pack_window **w_curs,
>  		off_t offset,
> -		off_t len)
> +		size_t len)
>  {
>  	unsigned char *in;
> -	unsigned long avail;
> +	size_t avail;

I know there were a lot of comments about "maybe this off_t switch is
not good". Let me say something a bit stronger: I think this part of the
change is strictly worse.

copy_pack_data() looks like this right now:

  static void copy_pack_data(struct hashfile *f,
                  struct packed_git *p,
                  struct pack_window **w_curs,
                  off_t offset,
                  off_t len)
  {
          unsigned char *in;
          unsigned long avail;
  
          while (len) {
                  in = use_pack(p, w_curs, offset, &avail);
                  if (avail > len)
                          avail = (unsigned long)len;
                  hashwrite(f, in, avail);
                  offset += avail;
                  len -= avail;
          }
  }

So right now let's imagine that off_t is 64-bit, and "unsigned long" is
32-bit (e.g., a 32-bit system, or an IL32P64 model like 64-bit Windows).
We'll repeatedly ask use_pack() for a window, and it will tell us how
many bytes we have in "avail". Even with "avail" as a 32-bit value, that
just means we'll process chunks smaller than 4GB, which is correct (or
at least this part of it is -- hold on). And we can still eventually
process the whole "len" given by the off_t.
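
To make the chunking argument concrete, here is a minimal sketch of the
same loop shape with fixed-width types. It is not the real code:
copy_in_chunks() and window_size are hypothetical names, uint64_t stands
in for a 64-bit off_t, uint32_t for a 32-bit "unsigned long", and
window_size simulates whatever use_pack() would report.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the loop in copy_pack_data(): "len" plays
 * the role of a 64-bit off_t, "avail" the role of a 32-bit unsigned
 * long. window_size simulates how many bytes use_pack() reports per
 * call. Returns the total number of bytes "copied".
 */
uint64_t copy_in_chunks(uint64_t len, uint32_t window_size)
{
	uint64_t copied = 0;

	assert(window_size > 0);	/* a zero-sized window would never finish */
	while (len) {
		uint32_t avail = window_size;
		if (avail > len)
			avail = (uint32_t)len;	/* safe: len < avail <= UINT32_MAX */
		copied += avail;
		len -= avail;
	}
	return copied;
}
```

With a 5GB "len" and 1GB windows, the loop just runs five times and
still accounts for every byte, even though "avail" never holds more
than 32 bits.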

But by switching away from off_t in the function interface, we risk
truncation before we even enter the loop. Because the switch is to
size_t, it actually works on an IL32P64 system (size_t is 64-bit
there), but it introduces a bug on a true 32-bit system. If your off_t
really is 64-bit (and it generally is, because we #define
_FILE_OFFSET_BITS), "len" will be truncated modulo 2^32 at the call.
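
As a rough illustration of what "modulo 2^32" means for the caller,
here is a sketch that models the conversion with fixed-width types so
it behaves the same on any platform. as_32bit_size() is a hypothetical
helper, with int64_t standing in for a 64-bit off_t and uint32_t for a
32-bit size_t.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of passing a 64-bit off_t to a parameter declared
 * as a 32-bit size_t: the conversion to an unsigned type keeps only the
 * low 32 bits of the value.
 */
uint32_t as_32bit_size(int64_t len)
{
	assert(len >= 0);	/* lengths are never negative here */
	return (uint32_t)len;	/* reduced modulo 2^32 */
}
```

So a 5GB length would arrive inside the function as 1GB, and an exact
4GB length as 0, in which case a "while (len)" loop would never run
at all.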

Nor will most compilers warn about this without -Wconversion. You can
try it with this on Linux:

  #define _FILE_OFFSET_BITS 64
  #include <unistd.h>
  
  void foo(size_t x);
  void bar(off_t x);
  
  void bar(off_t x)
  {
  	foo(x);
  }

That compiles fine with "gcc -c -m32 -Wall -Werror -Wextra" for me.
Adding "-Wconversion" catches it, but our code base is not close to
compiling with that warning enabled.

So I don't think this hunk fixes any problems, and it actually
introduces one.

I do in general support moving to size_t over "unsigned long". Switching
avail to size_t makes sense here. It's just the off_t part that is
funny.

-Peff


