Re: Decompression speed: zip vs lzo

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 10, 2008 at 07:02:39AM +0000, Sam Vilain wrote:
> Sam Vilain wrote:
> > I do really like LZOP as far as compression algorithms go.  It seems a
> > lot faster for not a huge loss in ratio.
> 
> Coincidentally, I read this today on an algorithm (LZMA - same as 7zip)
> which is very slow to compress, high ratio but quick decompression:
>
>   http://use.perl.org/~acme/journal/35330
>
> Which sounds excellent for squeezing those "archive packs" into even
> more ridiculously tiny spaces.

Well, lzma is excellent for *big* chunks of data, but not that impressive for
small files:

$ ll git.c git.c.gz git.c.lzma git.c.lzop
-rw-r--r-- 1 madcoder madcoder 12915 2008-01-09 13:47 git.c
-rw-r--r-- 1 madcoder madcoder  4225 2008-01-10 10:00 git.c.gz
-rw-r--r-- 1 madcoder madcoder  4094 2008-01-10 10:00 git.c.lzma
-rw-r--r-- 1 madcoder madcoder  5068 2008-01-10 09:59 git.c.lzop


And lzma performs really bad if you have few memory available. The "big" secret
of lzma is that it basically works with a huge window to check for repetitive
data, and even decompression needs quite a fair amount of memory, making it a
really bad choice for git IMNSHO.

Though I don't agree with you (and some others) about the fact that gzip is
fast enough. It's clearly a bottleneck in many log related commands where you
would expect it to be rather IO bound than CPU bound.  LZO seems like a fairer
choice, especially since what it makes gain is basically the compression of the
biggest blobs, aka the delta chains heads. It's really unclear to me if we
really gain in compressing the deltas, trees, and other smallish informations.

And when it comes to times, for a big file enough to give numbers, here are the
decompression times (best of 10 runs, smaller is better, second number is the
size of the packed data, original data was 7.8Mo):
  * lzma: 0.374s (2.2Mo)
  * gzip: 0.127s (2.9Mo)
  * lzop: 0.053s (3.2Mo)

For a 300k original file:
  * lzma: 0.022s (124Ko)
  * gzip: 0.008s (144Ko)
  * lzop: 0.004s (156Ko) /* most of the samples were actually 0.005 */

What is obvious to me is that lzop seems to take 10% more space than gzip,
while being around 1.5 to 2 times faster. Of course this is very sketchy and a
real test with git will be better.
-- 
·O·  Pierre Habouzit
··O                                                madcoder@xxxxxxxxxx
OOO                                                http://www.madism.org

Attachment: pgpQ8YSacRgLP.pgp
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux