Re: pack operation is thrashing my server

On Sat, Sep 06, 2008 at 06:46:29PM -0700, Linus Torvalds wrote:
> 
> 
> On Sat, 6 Sep 2008, Junio C Hamano wrote:
> > 
> > This is reproducible.  "rev-list --objects --all" in my copy of the kernel
> > repo takes around 47-48 seconds user time, and with the (idiotic) patch it
> > is cut down to 41-42 seconds.
> 
> So I had forgotten about that patch since nobody reacted to it.
> 
> I think the patch is wrong, please don't apply it, even though it does 
> help performance.
> 
> The reason? 
> 
> Right now we depend on "avail_out" also making zlib understand to stop 
> looking at the input stream. Sad, but true - we don't know or care about 
> the compressed size of the object, only the uncompressed size. So in 
> unpack_compressed_entry(), we simply set the output length, and expect 
> zlib to stop when it's sufficient.
> 
> Which it does - but the patch kind of violates that whole design.
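
To illustrate the "set the output length and let zlib stop" idea, here is a
rough Python sketch. It is not git's code: inflate_known_size and the toy
"pack" buffer are made up for the example. It assumes we know where an
object starts and how big it is uncompressed, but not how many compressed
bytes it occupies, so we hand zlib everything after the start offset and
cap only the output.

```python
import zlib

def inflate_known_size(buf, offset, size):
    """Inflate one zlib stream starting at `offset` in `buf`.

    Hypothetical helper: only the uncompressed `size` is known, so all
    remaining bytes are passed as input and the output is capped instead.
    Assumes size > 0 (a max_length of 0 would mean "no limit").
    """
    d = zlib.decompressobj()
    # max_length plays the role of avail_out: at most `size` bytes come back,
    # no matter how much input follows the stream.
    out = d.decompress(buf[offset:], size)
    if len(out) != size:
        raise ValueError("object is shorter than the recorded size")
    return out

# Two compressed objects stored back to back, as in a toy pack file.
first = b"blob one " * 20
second = b"blob two " * 20
packed_first = zlib.compress(first)
pack = packed_first + zlib.compress(second)

# The reader is never told how long packed_first is when inflating `first`.
obj1 = inflate_known_size(pack, 0, len(first))
obj2 = inflate_known_size(pack, len(packed_first), len(second))
```

The second stream's bytes are present in the input for the first call, but
they never show up in obj1, because the output cap is reached first.
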
> 
> Now, it so happens that things seem to work, probably because the zlib 
> format does have enough synchronization in it to not try to continue past 
> the end _anyway_, but I think this makes the patch be of debatable value.
> 
> I'm starting to hate zlib. I actually spent almost a week trying to clean 
> up the zlib source code and make it something that gcc can compile into 
> clean code, but the fact is, zlib isn't amenable to that. The whole "shift 
> <n> bits in from the buffer" approach means that there is no way to make 
> zlib generate good code unless you are an insanely competent assembly 
> hacker or have tons of registers to keep all the temporaries live in.
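
For anyone not familiar with the "shift <n> bits in from the buffer"
pattern, here is a rough Python sketch of an LSB-first bit accumulator, the
general style a DEFLATE decoder relies on. BitReader, need and take are
names invented for this example; this is not zlib's source, just the shape
of the technique. Every read touches the accumulator, the bit count and the
input position, which hints at why so many values have to stay live.

```python
class BitReader:
    """Illustrative LSB-first bit reader (hypothetical, not zlib's code)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0    # index of the next input byte to load
        self.hold = 0   # bit accumulator
        self.bits = 0   # number of valid bits currently in hold

    def need(self, n):
        # Shift whole bytes into the accumulator until n bits are available.
        while self.bits < n:
            if self.pos >= len(self.data):
                raise EOFError("ran out of input")
            self.hold |= self.data[self.pos] << self.bits
            self.pos += 1
            self.bits += 8

    def take(self, n):
        # Return the low n bits and drop them from the accumulator.
        self.need(n)
        value = self.hold & ((1 << n) - 1)
        self.hold >>= n
        self.bits -= n
        return value

# 0xCB is 0b11001011; read LSB-first as a DEFLATE block header,
# the first bit is the "final block" flag and the next two are the block type.
reader = BitReader(bytes([0xCB, 0x48]))
final_flag = reader.take(1)
block_type = reader.take(2)
```
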
> 
> Now, I still do think that all my reasons for choosing zlib were pretty 
> solid (it's a well-tested piece of code and it is _everywhere_ and easy to 
> use), but boy do I wish there had been alternatives. 

I know at least 7-zip has its own gzip compression/decompression code
(though it's C++). Maybe some other tools have theirs too.

Anyways, if it can make a speed difference, it might be worth having a
minimalist custom gzip compression/decompression "library" embedded
within git.

Mike