On Fri, Apr 12, 2019 at 04:04:39PM -0700, Rohit Ashiwal via GitGitGadget wrote: > From: Rohit Ashiwal <rohit.ashiwal265@xxxxxxxxx> > > MinGit for Windows comes without `gzip` bundled inside, git-archive uses > `gzip -cn` to compress tar files but for this to work, gzip needs to be > present on the host system. > > In the next commit, we will change the gzip compression so that we no > longer spawn `gzip` but let zlib perform the compression in the same > process instead. > > In preparation for this, we consolidate all the block writes into a > single function. Sounds like a good preparatory step. This part confused me, though: > @@ -38,11 +40,21 @@ static int write_tar_filter_archive(const struct archiver *ar, > #define USTAR_MAX_MTIME 077777777777ULL > #endif > > +/* writes out the whole block, or dies if fails */ > +static void write_block_or_die(const char *block) { > + if (gzip) { > + if (gzwrite(gzip, block, (unsigned) BLOCKSIZE) != BLOCKSIZE) > + die(_("gzwrite failed")); > + } else { > + write_or_die(1, block, BLOCKSIZE); > + } > +} What is gzwrite()? At first I thought this was an out-of-sequence bit of the series, but it turns out that this is a zlib.h interface. So the idea (I think) is that here we introduce a "gzip" variable that is always false, and this first conditional arm is effectively dead code. And then in a later patch we'd set up "gzip" and it would become not-dead. I think it would be less confusing if this just factored out write_block_or_die(), which starts as a thin wrapper and then grows the gzip parts in the next patch. A few nits on the code itself: > +static gzFile gzip; > [...] > + if (gzip) { Is it OK for us to ask about the truthiness of this opaque type? That works if it's really a pointer behind the scenes, but it seems like it would be equally OK for zlib to declare it as a struct. It looks OK in my version of zlib, and that library tends to be fairly conservative so I wouldn't be surprised if it was that way back to the beginning and remains that way for eternity. But it feels like a bad pattern. > + if (gzwrite(gzip, block, (unsigned) BLOCKSIZE) != BLOCKSIZE) This cast is interesting. All of the matching write_or_die() calls are promoting it to a size_t, which is also unsigned. BLOCKSIZE is a constant. Should we be defining it with a "U" in the first place? I doubt it matters much either way from a correctness perspective. I just wonder when I see a cast like that if we're going to get unexpected truncation or similar. -Peff