Re: [PATCH v2] Custom compression levels for objects and packs

On 5/8/07, Junio C Hamano <junkio@xxxxxxx> wrote:
Dana How <danahow@xxxxxxxxx> writes:
> Add config variables pack.compression and core.loosecompression .
> Loose objects will be compressed using level
>   isset(core.loosecompression) ? core.loosecompression :
>   isset(core.compression) ? core.compression : Z_BEST_SPEED
> and objects in packs will be compressed using level
>   isset(pack.compression) ? pack.compression :
>   isset(core.compression) ? core.compression : Z_DEFAULT_COMPRESSION
> pack-objects also accepts --compression=N which
> overrides the latter expression.

Do you think the above is readable?
  Compression level for loose objects is controlled by variable
  core.loosecompression (or core.compression, if the former is
  missing), and defaults to best-speed.
or something like that?
Your phrasing is much better.
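
To make the fallback order concrete, here is a rough sketch in C.
It is not code from the patch: struct level_setting, pick_level(),
loose_level() and pack_level() are hypothetical names, and the two
fallback constants only copy the numeric values of Z_BEST_SPEED (1)
and Z_DEFAULT_COMPRESSION (-1) from zlib.h so that the sketch
compiles without zlib.

```c
/* Hypothetical sketch of the fallback chain; not code from the patch. */

/* Same numeric values as the corresponding macros in zlib.h. */
#define SKETCH_BEST_SPEED          1
#define SKETCH_DEFAULT_COMPRESSION (-1)

/* One config variable: was it seen in the config, and with what value? */
struct level_setting {
	int seen;
	int value;
};

/* Return the most specific setting that was seen, else the fallback. */
static int pick_level(struct level_setting specific,
		      struct level_setting general,
		      int fallback)
{
	if (specific.seen)
		return specific.value;
	if (general.seen)
		return general.value;
	return fallback;
}

/* core.loosecompression, then core.compression, then best speed. */
int loose_level(struct level_setting core_loose,
		struct level_setting core)
{
	return pick_level(core_loose, core, SKETCH_BEST_SPEED);
}

/* --compression=N wins; otherwise pack.compression, then
 * core.compression, then zlib's default level. */
int pack_level(struct level_setting cmdline,
	       struct level_setting pack,
	       struct level_setting core)
{
	if (cmdline.seen)
		return cmdline.value;
	return pick_level(pack, core, SKETCH_DEFAULT_COMPRESSION);
}
```

With nothing configured, loose_level() gives 1 and pack_level()
gives -1; setting only core.compression to 5 makes both return 5.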

> This applies on top of the git-repack --max-pack-size patchset.
Hmph, that makes the --max-pack-size patchset take these more
trivial and straightforward improvements hostage.  In general,
I'd prefer more elaborate ones based on less questionable
series.
The max-pack-size and pack.compression patches touch the same lines.
I thought my options were:
* Submit independently and make you merge; or
* Make one precede the other.
Since max-pack-size has been out there since April 4 and
the first acceptable version was May 1 (suggested by 0 comments),
I didn't realize it was a "questionable series".

I think it should be straightforward for me to re-submit this
based on current master.

> +     /* differing core & pack compression when loose object -> must recompress */
> +     if (!entry->in_pack && pack_compression_level != zlib_compression_level)
> +             to_reuse = 0;
> +     else
I am not sure if that is worth it, as you do not know if the
loose object you are looking at was compressed with the current
settings.
You do not know for certain, that is correct.  However, config
settings that specify unequal compression levels signal that you
care differently about the two cases. (For me, I want the
compression investment to correspond to the expected lifetime of the file.)
Also, *if* we have the knobs we want in the config file,
I don't think we're going to be changing these settings all that often.

If I didn't have this check forcing recompression in the pack,
then in the absence of deltification each object would enter the pack
by being copied (in the preceding code block) and pack.compression
would have little effect.  I actually experienced this the very first
time I imported a large dataset into git (I was trying to achieve the
effect of this patch by changing core.compression dynamically, and
was a bit mystified for a while by the result).

Thus, if core.loosecompression is set to speed up git-add, I should
take the time to recompress the object when packing if pack.compression
is different (of course the hit of not doing so will be lessened by
deltification, which forces a new compression).
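
A simplified sketch of that decision, in C. The function name
can_reuse_compressed_data() and its parameters are made up for
illustration, and the other reuse conditions in the real code (the
branch after the "else" in the hunk above) are left out:

```c
/*
 * Hypothetical, simplified version of the reuse check.
 * Returns 1 if the already-compressed bytes may be copied into the
 * pack as-is, 0 if the object has to be recompressed.
 */
int can_reuse_compressed_data(int in_pack,
			      int pack_compression_level,
			      int loose_compression_level)
{
	/*
	 * A loose object was (presumably) deflated at the loose level.
	 * If the pack level differs, copying it would silently ignore
	 * pack.compression, so force a recompress.
	 */
	if (!in_pack && pack_compression_level != loose_compression_level)
		return 0;

	/* Objects already in a pack, or equal levels: copying is fine. */
	return 1;
}
```

So with a loose level of 1 and a pack level of 9, a loose object is
recompressed, while an object that already sits in a pack is still
copied unchanged.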

> diff --git a/cache.h b/cache.h
> index 8e76152..2b3f359 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -283,6 +283,8 @@ extern int warn_ambiguous_refs;
>  extern int shared_repository;
>  extern const char *apply_default_whitespace;
>  extern int zlib_compression_level;
> +extern int core_compression_level;
> +extern int core_compression_seen;

Could we somehow remove _seen?  Perhaps by initializing the
_level to -1?

> +int core_compression_level;
> +int core_compression_seen;

Same here.
I agree completely.  But what magic value should I use
to initialize the _level variables so I know they are not set?
All valid settings come from zlib.h through #defines, but
no "invalid" value is defined there.  Maybe I'll use -99.
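
One thing that rules out -1: zlib.h defines Z_DEFAULT_COMPRESSION
as -1, so -1 is itself a valid setting.  A minimal sketch of the
sentinel approach follows.  COMPRESSION_UNSET, level_is_set() and
level_is_valid() are hypothetical names, and the range check simply
hard-codes zlib's level values (-1, and 0 through 9) instead of
including zlib.h:

```c
/* Hypothetical sentinel sketch; the names are not from the patch. */

/* Marker for "not configured"; must lie outside zlib's valid levels. */
#define COMPRESSION_UNSET (-99)

/* Starts out unset; a config parser would overwrite it. */
int core_compression_level = COMPRESSION_UNSET;

/* Nonzero once a value has been assigned from the config. */
int level_is_set(int level)
{
	return level != COMPRESSION_UNSET;
}

/*
 * Nonzero for levels zlib accepts: -1 (Z_DEFAULT_COMPRESSION) and
 * 0 (Z_NO_COMPRESSION) through 9 (Z_BEST_COMPRESSION).
 */
int level_is_valid(int level)
{
	return level == -1 || (level >= 0 && level <= 9);
}
```

This drops the separate _seen flag: level_is_set() answers the same
question, and the marker can never be mistaken for a real level
because level_is_valid(COMPRESSION_UNSET) is false.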

Thanks,
--
Dana L. How  danahow@xxxxxxxxx  +1 650 804 5991 cell