Hi all,

ZSTD compression patches have been sent in a number of times over the
past few years. Every time, someone asks for benchmarks. Every time,
someone is concerned about compression time. Sometimes, someone provides
benchmarks.

But, as far as I can tell, nobody considered the compression parameters,
which have a significant impact on compression time and ratio.

So, I did some benchmarks myself, including all the compression levels
for each compressor.

Results:

The results are attached as SVG graphs and CSV data.

Summary:

- compression level, predictably, has a huge impact on compression time.
- compression level has virtually no impact on decompression time for
  lz4 and zstd, and some effect on the others. Interestingly, xz
  decompresses slightly faster at higher compression levels (perhaps
  cache-related).
- gzip compresses slightly faster than zstd at medium compression levels.
- bzip2 sucks: slow compression, very slow decompression, poor ratio.
- lzma decompresses slightly faster than xz, but is also slightly larger.
- xz is smallest but with very slow compression and decompression.
- lz4 decompresses fastest.
- zstd is a good balanced default.
- 7z is much faster than xz, even with wine overhead.

Files:

For the kernel, I did "make allmodconfig; sed -i -e '/=m$/d' .config"
with a 5.6 kernel and gcc 9.3.0 on x86_64, then concatenated vmlinux.bin
and vmlinux.relocs.

For the initramfs, I used the Arch Linux fallback initramfs with default
hooks.

Versions:

gzip 1.10
bzip2, a block-sorting file compressor. Version 1.0.8, 13-Jul-2019.
xz (XZ Utils) 5.2.5
*** LZ4 command line interface 64-bits v1.9.2, by Yann Collet ***
lzop 1.04
LZO library 2.10
*** zstd command line interface 64-bits v1.4.4, by Yann Collet ***
7-Zip 19.00 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2019-02-21

Notes:

I used the userspace versions of the decompressors, not the kernel
version. This is particularly relevant for xz, as the kernel xzminidec
is significantly slower than xz.
pigz is faster than gzip, but I used gzip as a common baseline.
7-Zip was run through wine with a persistent wineserver.

I ran the benchmark on a Ryzen 1600, with turbo boost turned off. Each
test was run only once, on the basis that any noise wouldn't disrupt the
overall curve, and also I don't want to spend hours waiting for the
results.

The current compression level defaults are:

- gzip -9
- bzip2 -9
- lzma -9
- xz --check=crc32 --x86 --lzma2=,dict=32MiB # except on ppc
- lzop -9
- lz4 -l -1

My conclusions:

- zstd is an improvement on almost all metrics.
- bzip2 and lzma should be removed post-haste.
- lzo should be removed once zstd is merged.
- compression level is important to consider for compression speed: the
  default lz4 -1 compresses very fast but has a very poor compression
  ratio. zstd -19 compresses barely better than zstd -18, but takes
  significantly longer to compress.
- compression level should be configurable: lz4 -1 is useful, but so is
  lz4 -9. zstd -1 is useful for developers who want kernel builds as
  fast as possible, zstd -19 for everybody else.
- gzip is far from the fastest compressor (even excluding cat).
- modern compressors (xz, lz4, zstd) decompress about equally fast at
  every compression level; higher levels only require more memory.
- 7-Zip is much faster than xz; this needs more research.
- 7-Zip BCJ2 is slightly better than xz/BCJ. Better filters for all
  archs would probably be a good area of research, as apparently
  BCJ/BCJ2 are intended only for 32-bit x86.

Thanks,
Alex.
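
For anyone who wants to see the shape of a per-level sweep like the one
described above, here is a rough sketch in Python. It is not the script
behind the attached results, which used the command-line tools listed
under Versions. It only covers the codecs in the Python standard library
(zlib standing in for gzip, bz2, and lzma in .xz format), and the names
CODECS and sweep as well as the synthetic input are made up for
illustration. Like the benchmark above, it times each level only once.

```python
import bz2
import lzma
import time
import zlib

# Assumption: stdlib codecs stand in for the CLI tools. lz4, lzo and zstd
# are left out because most Python versions do not ship them.
# Each entry: name -> (compress(data, level), decompress(blob), levels).
CODECS = {
    "zlib": (lambda d, lvl: zlib.compress(d, lvl), zlib.decompress, range(1, 10)),
    "bz2": (lambda d, lvl: bz2.compress(d, lvl), bz2.decompress, range(1, 10)),
    "xz": (lambda d, lvl: lzma.compress(d, preset=lvl), lzma.decompress, range(0, 10)),
}


def sweep(data):
    """Return one row per (codec, level) with sizes and single-run timings."""
    rows = []
    for name, (compress, decompress, levels) in CODECS.items():
        for level in levels:
            start = time.perf_counter()
            blob = compress(data, level)
            compress_s = time.perf_counter() - start

            start = time.perf_counter()
            restored = decompress(blob)
            decompress_s = time.perf_counter() - start

            # Guard against timing a broken round trip.
            if restored != data:
                raise ValueError(f"{name} level {level} did not round-trip")

            rows.append({
                "codec": name,
                "level": level,
                "original": len(data),
                "compressed": len(blob),
                "ratio": len(blob) / len(data),  # smaller is better
                "compress_s": compress_s,
                "decompress_s": decompress_s,
            })
    return rows


if __name__ == "__main__":
    # Hypothetical input: repetitive synthetic text, not a kernel image.
    sample = b"".join(b"module_%d: init ok\n" % (i % 97) for i in range(5000))
    print("codec,level,original,compressed,ratio,compress_s,decompress_s")
    for row in sweep(sample):
        print("{codec},{level},{original},{compressed},{ratio:.3f},"
              "{compress_s:.6f},{decompress_s:.6f}".format(**row))
```

To get numbers that mean anything, feed it a real image (for example the
bytes of the concatenated vmlinux.bin and vmlinux.relocs) instead of the
synthetic sample. In-process library timings will not match the CLI
tools exactly, so treat the output as a curve shape, not as absolute
numbers.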
Attachment:
kernel-compression-benchmarks.tar.gz
Description: application/compressed-tar