On Wed, 2 Feb 2005, Sean Middleditch wrote:
> I wonder how much time has been spent by assembler-gods trying to optimize bzip2 for various architectures, though...
I don't think that's going to matter, since bzip2 is cache-unfriendly: on most consumer-grade CPUs, the window of data being accessed is too big to fit in the CPU cache.
Because a cache miss (when data has to be fetched from RAM) can easily take 200-400 CPU cycles, there is no way for a program to both have a big cache footprint and run fast.
Gzip, on the other hand, manipulates a compression dictionary (a 32 KB sliding window) that fits in the CPU cache, so the actual decompression can be done at full speed.
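To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It is a crude model, not a measurement: it assumes accesses are spread uniformly over the working set, so the hit rate is simply cache size divided by working-set size. The 512 KiB cache, the 3-cycle hit cost and the 4 MiB "large" working set are made-up illustrative values; the 300-cycle miss penalty is just the midpoint of the 200-400 range above, and 32 KiB is the gzip window size.

```python
def avg_cycles_per_access(working_set_bytes, cache_bytes,
                          hit_cycles=3, miss_penalty_cycles=300):
    """Estimate the average cost of one memory access, in CPU cycles.

    Crude assumption: accesses are uniformly random over the working
    set, so the fraction that hits the cache is cache / working set
    (capped at 1 when everything fits).
    """
    hit_fraction = min(1.0, cache_bytes / working_set_bytes)
    return hit_cycles + (1.0 - hit_fraction) * miss_penalty_cycles


CACHE = 512 * 1024  # hypothetical cache size, illustrative only

# Working set that fits in cache (the 32 KiB gzip window).
small = avg_cycles_per_access(32 * 1024, CACHE)

# Working set much larger than the cache (hypothetical 4 MiB).
large = avg_cycles_per_access(4 * 1024 * 1024, CACHE)

print(f"fits in cache:  {small:.1f} cycles/access")   # 3.0
print(f"exceeds cache:  {large:.1f} cycles/access")   # 265.5
print(f"slowdown:       {large / small:.1f}x")        # 88.5x
```

Real access patterns are not uniformly random and real CPUs prefetch, so the actual gap will be smaller than this, but it shows why shaving a few instructions in assembler cannot make up for a working set that does not fit in cache.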
-- "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian W. Kernighan