Re: git gc --aggressive led to about 40 times slower "git log --raw"

Duy Nguyen <pclouds@xxxxxxxxx> writes:

> On Tue, Feb 18, 2014 at 3:55 PM, David Kastrup <dak@xxxxxxx> wrote:
>
>> I've seen the same with my ongoing work on git-blame with the current
>> Emacs Git mirror.  Aggressive packing reduces the repository size to
>> about a quarter, but it blows up the system time (mainly I/O)
>> significantly, quite reducing the total benefits of my algorithmic
>> improvements there.
>
> Likely because --aggressive passes --depth=250 to pack-objects. Long
> delta chains could reduce pack size and increase I/O as well as zlib
> processing significantly.

Increased zlib processing time is one thing, but if it _increases_ I/O,
then it would seem there is a serious impedance mismatch between the
compression scheme and the code relying on it, leading to repeated reads
of blocks only needed for reconstructing dynamic compression
dictionaries.

Compression should reduce rather than increase the total amount of
reads.  So it would seem that what is needed is better caching, smaller
independent block sizes, and/or strategies for sorting the delta chain
so that resolving it requires mostly linear reads, done in a manner
that does not reinitialize the decompression for each delta that
happens to be more or less "in sequence".
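
To make the caching point concrete, here is a toy model in Python.  It
is only a sketch and has nothing to do with Git's actual pack code: the
names make_chain and count_reads are made up for illustration, one
"read" stands for fetching and inflating one stored object, and the
cache is a plain LRU of reconstructed objects.

```python
from collections import OrderedDict


def make_chain(depth):
    """Hypothetical linear delta chain: object i is a delta against
    object i - 1, and object 0 is stored in full."""
    return {i: (i - 1 if i > 0 else None) for i in range(depth + 1)}


def count_reads(parents, accesses, cache_size):
    """Count stored objects that must be read and inflated to serve
    `accesses`, given an LRU cache of already reconstructed objects."""
    cache = OrderedDict()
    reads = 0
    for obj in accesses:
        # Walk toward the base until we hit something already cached.
        chain = []
        cur = obj
        while cur is not None and cur not in cache:
            chain.append(cur)
            cur = parents[cur]
        if cur is not None:
            cache.move_to_end(cur)  # cache hit: mark as recently used
        reads += len(chain)
        # Objects are rebuilt base first; each result enters the cache.
        for rebuilt in reversed(chain):
            cache[rebuilt] = True
            while len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return reads


chain = make_chain(250)                  # depth used by --aggressive
base_first = list(range(251))
tip_first = list(reversed(base_first))

print(count_reads(chain, base_first, 0))    # 31626: no cache at all
print(count_reads(chain, base_first, 1))    # 251: one cached object
print(count_reads(chain, tip_first, 1))     # 31626: wrong access order
print(count_reads(chain, tip_first, 251))   # 251: whole chain cached
```

In this model a one-entry cache is enough when the accesses run along
the chain from the base, but it is useless when they run the other way
unless the cache holds the whole chain.  How closely that matches real
access patterns is an open assumption here.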

Of course, this is assuming that the additional time is spent
uncompressing data rather than navigating directories.

It's actually conceivable that there is quite a bit of potential to get
better performance from unchanged readers by packing stuff in a
different order while still using the same delta chain depth.
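
A second toy sketch for the ordering idea, again not Git's real pack
layout: count_seeks is a hypothetical helper that treats a pack as a
flat list of object names and counts every read that does not start
right after the previous one as a seek.  The object names and the two
chains are invented for the example, and the reader is assumed to
follow a chain from its tip down to its base.

```python
def count_seeks(layout, read_order):
    """Count reads that are not directly after the previous read in
    the (hypothetical) on-disk layout.  The first read always counts."""
    position = {obj: i for i, obj in enumerate(layout)}
    seeks = 0
    prev = None
    for obj in read_order:
        pos = position[obj]
        if prev is None or pos != prev + 1:
            seeks += 1
        prev = pos
    return seeks


# Two chains of depth 4; x4 is the tip, x0 the full base object.
chain_a = ["a4", "a3", "a2", "a1", "a0"]
chain_b = ["b4", "b3", "b2", "b1", "b0"]

grouped = chain_a + chain_b                      # each chain contiguous
interleaved = [o for pair in zip(chain_a, chain_b) for o in pair]
base_first = chain_a[::-1] + chain_b[::-1]       # contiguous, reversed

# Resolving a4 means reading a4, a3, a2, a1, a0 in that order.
print(count_seeks(grouped, chain_a))      # 1: one seek, then linear
print(count_seeks(interleaved, chain_a))  # 5: every step jumps
print(count_seeks(base_first, chain_a))   # 5: every step goes backward
```

All three layouts hold the same objects with the same chain depth; only
the order differs, and with it the number of non-sequential reads.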

-- 
David Kastrup