Re: Delta compression not so effective

On Wed, Mar 1, 2017 at 5:51 AM, Marius Storm-Olsen <mstormo@xxxxxxxxx> wrote:
>
> When first importing, I disabled gc to avoid any repacking until completed.
> When done importing, there was 209GB of all loose objects (~670k files).
> With the hopes of quick consolidation, I did a
>     git -c gc.autoDetach=0 -c gc.reflogExpire=0 \
>           -c gc.reflogExpireUnreachable=0 -c gc.rerereresolved=0 \
>           -c gc.rerereunresolved=0 -c gc.pruneExpire=now \
>           gc --prune
> which brought it down to 206GB in a single pack. I then ran
>     git repack -a -d -F --window=350 --depth=250
> which took it down to 203GB, where I'm at right now.

Considering that it was 209GB in loose objects, I don't think it
delta-packed the big objects at all.
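
One quick way to check (a rough sketch; adjust the pack path and the
count to taste) is to list the biggest blobs with verify-pack. Deltified
entries carry a trailing "depth base-sha1" pair, so anything without it
was stored whole:

    git verify-pack -v .git/objects/pack/pack-*.idx |
        grep ' blob ' | sort -k3 -n -r | head -20

If the multi-GB blobs all come out without a delta base, the repack
never deltified them in the first place.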

I wonder if the big objects end up hitting some size limit that causes
the delta creation to fail.

For example, we have that HASH_LIMIT that limits how many hashes
we'll create for the same hash bucket, because there's some quadratic
behavior in the delta algorithm. It gets triggered by things like big
files that have lots of repeated content.
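
A rough way to see whether one of the big blobs is that kind of
repetitive data (a heuristic only; it assumes the ~16-byte block
granularity the delta code indexes at, and <sha1> stands in for one of
the big blobs from the verify-pack listing above) is to count how often
the same block repeats:

    git cat-file blob <sha1> | xxd -p -c 16 | sort | uniq -c | sort -rn | head

A handful of block values covering most of the file is exactly the
pattern that runs into that limit.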

We also have various memory limits, in particular
'window_memory_limit'. That one should default to 0 (no limit), but
maybe you limited it at some point in a config file and forgot about it?
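
It's easy to check for a stale setting (a sketch; pack.windowMemory is
the config knob behind that limit, and --window-memory on repack
overrides it for a single run):

    git config --show-origin --get-regexp 'pack\.window'
    git repack -a -d -F --window=350 --depth=250 --window-memory=0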

                                     Linus


