On Sat, Sep 06, 2008 at 06:46:29PM -0700, Linus Torvalds wrote:
>
>
> On Sat, 6 Sep 2008, Junio C Hamano wrote:
> >
> > This is reproducible "rev-list --objects --all" in my copy of the kernel
> > repo takes around 47-48 seconds user time, and with the (idiotic) patch it
> > is cut down to 41-42 seconds.
>
> So I had forgotten about that patch since nobody reacted to it.
>
> I think the patch is wrong, please don't apply it, even though it does
> help performance.
>
> The reason?
>
> Right now we depend on "avail_out" also making zlib understand to stop
> looking at the input stream. Sad, but true - we don't know or care about
> the compressed size of the object, only the uncompressed size. So in
> unpack_compressed_entry(), we simply set the output length, and expect
> zlib to stop when it's sufficient.
>
> Which it does - but the patch kind of violates that whole design.
>
> Now, it so happens that things seem to work, probably because the zlib
> format does have enough synchronization in it to not try to continue past
> the end _anyway_, but I think this makes the patch be of debatable value.
>
> I'm starting to hate zlib. I actually spent almost a week trying to clean
> up the zlib source code and make it something that gcc can compile into
> clean code, but the fact is, zlib isn't amenable to that. The whole "shift
> <n> bits in from the buffer" approach means that there is no way to make
> zlib generate good code unless you are an insanely competent assembly
> hacker or have tons of registers to keep all the temporaries live in.
>
> Now, I still do think that all my reasons for choosing zlib were pretty
> solid (it's a well-tested piece of code and it is _everywhere_ and easy to
> use), but boy do I wish there had been alternatives.

I know at least 7-zip has its own gzip compression/decompression code
(though it's C++). Maybe some other tools have theirs too.
Anyways, if it can make a speed difference, it might be worth having a
minimalist custom gzip compression/decompression "library" embedded
within git.

Mike
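
For reference, the "avail_out" point in the quoted text (inflate with only
the uncompressed size known, and let zlib decide where the compressed data
ends) can be sketched roughly as below. This is an illustration in Python,
not git's C code: the function name is borrowed from the quoted message, and
the pack/offset arguments, the error handling, and the back-to-back stream
layout in the usage lines are assumptions made for the example. Python's
max_length argument stands in for avail_out here.

```python
import zlib


def unpack_compressed_entry(pack, offset, size):
    """Inflate one zlib stream starting at pack[offset].

    Only the uncompressed size is known in advance. Returns a tuple
    (data, consumed), where consumed is the number of compressed bytes
    zlib actually read. Hypothetical helper, for illustration only.
    """
    inflater = zlib.decompressobj()
    # max_length caps the output much like avail_out does in C.
    # A value of 0 means "no limit" in Python, so ask for at least 1.
    # (Slicing copies the rest of the buffer; fine for a sketch.)
    data = inflater.decompress(pack[offset:], max(size, 1))
    if not inflater.eof:
        # Let zlib read the end-of-stream marker and checksum, if it
        # has not done so already. Any further output means the stream
        # holds more than `size` bytes.
        extra = inflater.decompress(inflater.unconsumed_tail, 1)
        if extra or not inflater.eof:
            raise ValueError("stream does not end after `size` bytes")
    if len(data) != size:
        raise ValueError("stream is not `size` bytes long")
    # Bytes that follow the end of the stream land in unused_data.
    consumed = len(pack) - offset - len(inflater.unused_data)
    return data, consumed


# Usage: two streams stored back to back, compressed sizes not recorded.
first = b"hello " * 100
second = b"world " * 50
pack = zlib.compress(first) + zlib.compress(second)

data1, used1 = unpack_compressed_entry(pack, 0, len(first))
data2, used2 = unpack_compressed_entry(pack, used1, len(second))
```

The second call only works because the first one reports how much input
zlib consumed, which is the behaviour the quoted message says the current
code relies on.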