Re: [PATCH] archive: support compression levels beyond 9

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 09.11.20 um 19:35 schrieb Junio C Hamano:
> René Scharfe <l.s.r@xxxxxx> writes:
>
>> Compression programs like zip, gzip, bzip2 and xz allow to adjust the
>> trade-off between CPU cost and size gain with numerical options from -1
>> for fast compression and -9 for high compression ratio.  zip also
>> accepts -0 for storing files verbatim.  git archive directly support
>> these single-digit compression levels for ZIP output and passes them to
>> filters like gzip.
>>
>> Zstandard additionally supports compression level options -10 to -19, or
>> up to -22 with --ultra.  This *seems* to work with git archive in most
>> cases, e.g. it will produce an archive with -19 without complaining, but
>> since it only supports single-digit compression level options this is
>> the same as -1 -9 and thus -9.
>>
>> Allow git archive to accept multi-digit compression levels to support
>> the full range supported by zstd.  Explicitly reject them for the ZIP
>> format, as otherwise deflateInit2() would just fail with a somewhat
>> cryptic "stream consistency error".
>
> The implementation looks more like "not enable them for the ZIP
> format", but the symptom observable to end-users is exactly
> "explicitly reject", so that's OK ;-)
>
> As with the usual compression levels, this is only about how
> deflator finds a better results, and the stream is understandable by
> any existing inflator, right?

Support for higher levels might have been added in later versions of
Zstandard -- https://github.com/facebook/zstd/blob/dev/CHANGELOG
mentions "Command line utility compatible with high compression levels"
for v.0.4.0.  I'm not aware of other implementations  of the algorithm
than the original one from Facebook, so I don't know how compatible
they are.  It's not a problem we can solve in Git, though.

Side note: Using Zstandard with git archive requires the config setting
tar.tar.zst.command=zstd.

>> diff --git a/archive.c b/archive.c
>> index 3c1541af9e..7a888c5338 100644
>> --- a/archive.c
>> +++ b/archive.c
>> @@ -529,10 +529,12 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
>>  	return 0;
>>  }
>>
>> -#define OPT__COMPR(s, v, h, p) \
>> -	OPT_SET_INT_F(s, NULL, v, h, p, PARSE_OPT_NONEG)
>> -#define OPT__COMPR_HIDDEN(s, v, p) \
>> -	OPT_SET_INT_F(s, NULL, v, "", p, PARSE_OPT_NONEG | PARSE_OPT_HIDDEN)
>> +static int number_callback(const struct option *opt, const char *arg, int unset)
>> +{
>> +	BUG_ON_OPT_NEG(unset);
>> +	*(int *)opt->value = strtol(arg, NULL, 10);
>> +	return 0;
>> +}
>>
>>  static int parse_archive_args(int argc, const char **argv,
>>  		const struct archiver **ar, struct archiver_args *args,
>> @@ -561,16 +563,8 @@ static int parse_archive_args(int argc, const char **argv,
>>  		OPT_BOOL(0, "worktree-attributes", &worktree_attributes,
>>  			N_("read .gitattributes in working directory")),
>>  		OPT__VERBOSE(&verbose, N_("report archived files on stderr")),
>> -		OPT__COMPR('0', &compression_level, N_("store only"), 0),
>> -		OPT__COMPR('1', &compression_level, N_("compress faster"), 1),
>> -		OPT__COMPR_HIDDEN('2', &compression_level, 2),
>> -		OPT__COMPR_HIDDEN('3', &compression_level, 3),
>> -		OPT__COMPR_HIDDEN('4', &compression_level, 4),
>> -		OPT__COMPR_HIDDEN('5', &compression_level, 5),
>> -		OPT__COMPR_HIDDEN('6', &compression_level, 6),
>> -		OPT__COMPR_HIDDEN('7', &compression_level, 7),
>> -		OPT__COMPR_HIDDEN('8', &compression_level, 8),
>> -		OPT__COMPR('9', &compression_level, N_("compress better"), 9),
>> +		OPT_NUMBER_CALLBACK(&compression_level,
>> +			N_("set compression level"), number_callback),
>
> Doubly nice.  Adds a feature while removing lines.
>
> Do we miss the description given in "git archive -h" though?
>
>     usage: git archive [<options>] <tree-ish> [<path>...]
>        or: git archive --list
>        ...
>         -v, --verbose         report archived files on stderr
>         -0                    store only
>         -1                    compress faster
>         -9                    compress better
>

Perhaps; I just couldn't cram it all into a single line.  Showing an
acceptable range would be nice and terse, but that depends on the
compressor.

Hmm, adding an option for passing arbitrary options to the filter and
removing the feature flag ARCHIVER_WANT_COMPRESSION_LEVELS from
archive-tar.c would be cleaner overall.  The latter would be a
regression, though.

René




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux