On Thu, 26 Aug 2010, Stephen John Smoogen wrote:

> So I was asked to look at compression on log servers and to see if
> changing to xz would save us some space. My test is not comprehensive
> but showed what might happen.
>
> Basic summary. XZ may save us up to 2% over what we are currently
> saving but its real advantage is in speed of uncompressing files over
> bzip2. [compression may be faster for some files also.]
>
> File         | Size    | Gzip  | G%   | Bzip2 | B%   | XZ    | X%
> messages.log | 644568  | 10992 | 98.3 | 4856  | 99.3 | 5940  | 99.1
> mail.log     | 610816  | 65060 | 89.3 | 40836 | 93.3 | 35536 | 94.5
> TOTAL        | 1255384 | 76052 | 93.5 | 45692 | 96.1 | 41476 | 96.5
>
> Program | Compression Time | Uncompression Time
> GZIP    | 00m43.416s       | 00m10.033s
> BZIP    | 10m42.296s       | 01m02.525s
> XZ      | 10m15.937s       | 00m12.565s
>
>
> Raw data below
>
> [root@log01 smooge-b]# du -s messages.log mail.log
> 644568  messages.log
> 610816  mail.log
> [root@log01 smooge-b]# time gzip -v -9 messages.log mail.log
> messages.log:  98.3% -- replaced with messages.log.gz
> mail.log:      89.3% -- replaced with mail.log.gz
>
> real    0m43.416s
> user    0m41.335s
> sys     0m1.736s
> [root@log01 smooge-b]# du -s messages.log.gz mail.log.gz
> 10992   messages.log.gz
> 65060   mail.log.gz
> [root@log01 smooge-b]# time gunzip -v messages.log.gz mail.log.gz
> messages.log.gz:  98.3% -- replaced with messages.log
> mail.log.gz:      89.3% -- replaced with mail.log
>
> real    0m10.033s
> user    0m6.948s
> sys     0m3.004s
>
> [root@log01 smooge-b]# time bzip2 -v -9 messages.log mail.log
> messages.log: 133.043:1, 0.060 bits/byte, 99.25% saved, 659381328 in, 4956148 out.
> mail.log:     14.961:1, 0.535 bits/byte, 93.32% saved, 624854215 in, 41766136 out.
>
> real    10m42.296s
> user    10m36.948s
> sys     0m1.608s
> [root@log01 smooge-b]# du -sc messages.log.bz2 mail.log.bz2
> 4856    messages.log.bz2
> 40836   mail.log.bz2
> 45692   total
> [root@log01 smooge-b]# time bunzip2 -v messages.log.bz2 mail.log.bz2
> messages.log.bz2: done
> mail.log.bz2:     done
>
> real    1m2.525s
> user    0m44.779s
> sys     0m4.956s
>
> [root@log01 smooge-b]# time xz -v -9 messages.log mail.log
> messages.log (1/2)
>   100.0 %   5,923.6 KiB / 628.8 MiB = 0.009   3.1 MiB/s   3:21
>
> mail.log (2/2)
>   100.0 %   34.7 MiB / 595.9 MiB = 0.058   1.4 MiB/s   6:53
>
> real    10m15.937s
> user    10m8.550s
> sys     0m3.552s
> [root@log01 smooge-b]# du -s messages.log.xz mail.log.xz
> 5940    messages.log.xz
> 35536   mail.log.xz
> [root@log01 smooge-b]# time unxz -v messages.log.xz mail.log.xz
> messages.log.xz (1/2)
>   100.0 %   5,923.6 KiB / 628.8 MiB = 0.009   140 MiB/s   0:04
>
> mail.log.xz (2/2)
>   100.0 %   34.7 MiB / 595.9 MiB = 0.058   74 MiB/s   0:08
>
> real    0m12.565s
> user    0m8.709s
> sys     0m3.636s
>

It does take a while to grep through the bzipped logs. If you want to
re-compress them all, I say have at it.

-Mike
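
For anyone who wants to redo the "percent saved" columns from the raw du
figures quoted above, here is a minimal sketch. It assumes the du -s sizes
are in the same units for both files (1 KiB blocks is the usual GNU du
default), and saved_pct is just a hypothetical helper name, not part of any
tool. Recomputed this way, a few table entries (the TOTAL row in
particular) can come out a few tenths of a percent different from what was
posted.

```shell
# saved_pct ORIG COMP
# Prints the percentage of space saved, to one decimal place.
# Hypothetical helper for illustration; sizes must share the same unit.
saved_pct() {
    LC_ALL=C awk -v orig="$1" -v comp="$2" \
        'BEGIN { printf "%.1f\n", 100 * (1 - comp / orig) }'
}

# Sizes copied from the du -s output in the quoted message.
saved_pct 644568 10992   # messages.log with gzip -9,  prints 98.3
saved_pct 644568 5940    # messages.log with xz -9,    prints 99.1
saved_pct 610816 40836   # mail.log with bzip2 -9,     prints 93.3
```

The same function works on the totals, e.g. `saved_pct 1255384 41476` for
the combined xz result.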