Re: [lorax] Split compression into separate methods; use xz to compress initrd

John Reiser <jreiser@xxxxxxxxxxxx> · Thu, 17 Feb 2011 07:54:41 -0800

On 02/17/2011 06:40 AM, Will Woods wrote:
> The most significant reason we might want to use xz instead of lzma is
> integrity checking - gzip and xz use crc32, lzma has none. 

That's not really true.  If the header is OK and lzma decompression
reaches EOF on input in the expected state (0==accumulator &&
bytes_written==original_length), then that is an integrity check
broadly equivalent to crc32.  lzma decompression is equivalent to
an "arithmetic long division" of the encoded input representation;
crc32 is a "polynomial long division" of the bitstring.
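
One rough way to probe this claim is to flip single bits in a raw .lzma
stream and see how often the decoder itself objects.  The sketch below
assumes Python's standard lzma module (a liblzma wrapper), which is not
necessarily the decoder used for an initrd; corruption_outcomes is a
name made up for illustration.  As far as I know, liblzma writes .lzma
files with an "unknown size" header and an end-of-stream marker, so the
check exercised here is the marker plus the decoder state rather than
bytes_written==original_length.

```python
import lzma
import random

HEADER_LEN = 13  # .lzma header: properties (1) + dict size (4) + size (8)

def corruption_outcomes(payload: bytes, trials: int = 200, seed: int = 0):
    """Flip one random bit after the header and classify what the decoder does."""
    rng = random.Random(seed)
    packed = lzma.compress(payload, format=lzma.FORMAT_ALONE)
    counts = {"rejected": 0, "silent_wrong": 0, "unchanged": 0}
    for _ in range(trials):
        damaged = bytearray(packed)
        pos = rng.randrange(HEADER_LEN, len(packed))
        damaged[pos] ^= 1 << rng.randrange(8)
        try:
            out = lzma.decompress(bytes(damaged), format=lzma.FORMAT_ALONE)
        except lzma.LZMAError:
            # The decoder noticed the damage with no separate checksum.
            counts["rejected"] += 1
            continue
        # Decoding "succeeded": did the output survive intact?
        counts["unchanged" if out == payload else "silent_wrong"] += 1
    return counts

if __name__ == "__main__":
    print(corruption_outcomes(b"anaconda initrd payload " * 500))
```

The "silent_wrong" count is the only case where an added crc32 could
help; the numbers will vary with the payload and the decoder.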

The value added by crc32 is low.  Because crc32 is orthogonal to
the algorithmic check, the probability that crc32 catches
an otherwise-undetected error is 2**-32.

The cost of crc32 is high.  crc32 pollutes the data cache, often to
a degree equivalent to flushing a major portion of L1, because common
implementations use many kilobytes of tables in the name of speed.
The adler32 checksum is *MUCH* better: no tables, less code, faster,
no cache pollution.  adler32 is about 1/4096 less powerful
(65521/65536) in detecting impostors.  crc32 is trivial in hardware
and has mindshare.  But in software, crc32 should be replaced by
adler32.
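
To make the structural difference concrete, here is a minimal Python
sketch of both checksums.  It is not taken from any particular
implementation; the names adler32, crc32 and make_crc32_table are
illustrative, and zlib is imported only to cross-check the results.
Python cannot show the cache effects, only the shape of the code:
adler32 keeps two running sums, while the usual byte-at-a-time crc32
needs a 256-entry lookup table.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16

def adler32(data: bytes) -> int:
    """Adler-32: two running sums modulo 65521, no lookup table."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    return (b << 16) | a

def make_crc32_table():
    """Build the 256-entry table for the reflected CRC-32 polynomial."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            if c & 1:
                c = (c >> 1) ^ 0xEDB88320
            else:
                c >>= 1
        table.append(c)
    return table

CRC32_TABLE = make_crc32_table()

def crc32(data: bytes) -> int:
    """Table-driven CRC-32, one table lookup per input byte."""
    c = 0xFFFFFFFF
    for byte in data:
        c = CRC32_TABLE[(c ^ byte) & 0xFF] ^ (c >> 8)
    return c ^ 0xFFFFFFFF

if __name__ == "__main__":
    sample = b"123456789"
    print(hex(adler32(sample)), hex(zlib.adler32(sample)))
    print(hex(crc32(sample)), hex(zlib.crc32(sample)))
```

In C, this table is 256 32-bit entries, or 1 KiB; the faster
"slicing" variants use several such tables (eight of them, 8 KiB, for
slicing-by-8), which is where the cache pressure comes from.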

-- 

_______________________________________________
Anaconda-devel-list mailing list
Anaconda-devel-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/anaconda-devel-list