Re: Compression and dictionaries

Hi,

On Mon, 14 Aug 2006, Jon Smirl wrote:

> Does a zlib dictionary just change the probabilities in the histogram 
> or does it turn the dictionary into a pre-loaded encoding tree?

I have to admit that I do not know zlib well enough to tell off the top of 
my head, but I guess it would make more sense to have it as a preloaded 
encoding tree.
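
For what it is worth, the zlib manual describes a preset dictionary (deflateSetDictionary) as neither of the two: it primes the LZ77 sliding window, so that the first bytes of an item can be encoded as back-references into the dictionary. The Huffman codes are still built per block. Here is a minimal sketch using Python's zlib binding; the dictionary contents and the sample item are made up for illustration and are not taken from any real repository:

```python
import zlib

# Hypothetical shared dictionary: text that many stored items are assumed to contain.
SHARED_DICT = (
    b"#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n"
    b"int main(int argc, char **argv)\n{\n\treturn 0;\n}\n"
)

def deflate(data: bytes, zdict: bytes = b"") -> bytes:
    """Compress one item, optionally priming the LZ77 window with zdict."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()

def inflate(blob: bytes, zdict: bytes = b"") -> bytes:
    """Decompress one item; the same zdict must be supplied if one was used."""
    if zdict:
        decomp = zlib.decompressobj(zdict=zdict)
    else:
        decomp = zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()

# A small item that shares long substrings with the dictionary.
item = (
    b"#include <stdio.h>\n#include <stdlib.h>\n\n"
    b"int main(int argc, char **argv)\n{\n\tputs(\"hi\");\n\treturn 0;\n}\n"
)

plain = deflate(item)                 # no dictionary
packed = deflate(item, SHARED_DICT)   # window primed with the dictionary

print("raw:", len(item), "plain:", len(plain), "with dict:", len(packed))
```

The gain comes from matches against the dictionary, so it only shows up for items that actually share substrings with it, and it is largest for small items.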

> The other compression schemes I looked at let you load in a
> precomputed Huffman/arithmetic encoding tree. By preloading an
> encoding tree you avoid storing the encoding of "void => 010101" in
> every item. Removing 1M encoding maps and using one common one should
> be a win. Items not in the map would still be stored using internal
> additions to the map.
> 
> Changing the probabilities probably won't help much, but there may be
> good gains from partially eliminating 1M encoding maps.

I _think_ that it would not matter much. The deltas have a more important 
impact.

> > Further, if the pack-file becomes corrupt, you usually still have the 
> > pack index, or the start of the pack-file, and can reconstruct most of 
> > the objects. If you use a dictionary, and just one bit flips in it, 
> > you're screwed.

I still think that this is important to think through: is it worth 
saving a couple of kilobytes (I doubt that it would be as much as 1MB in 
_total_) if that puts us on the unsafe side?
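
To make the risk concrete, here is a small sketch, again with Python's zlib binding and with made-up items and dictionary contents. It flips a single bit in a shared dictionary and counts how many items can still be decompressed. zlib stores an Adler-32 of the dictionary in each stream, so it refuses to decompress rather than silently returning garbage, but the data is lost either way:

```python
import zlib

# Hypothetical shared dictionary and items, for illustration only.
SHARED_DICT = b"static int void return struct unsigned char const "

def pack(data: bytes, zdict: bytes) -> bytes:
    """Compress one item against the shared dictionary."""
    comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()

def unpack(blob: bytes, zdict: bytes) -> bytes:
    """Decompress one item; raises zlib.error if the dictionary does not match."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()

def count_failures(blobs, zdict: bytes) -> int:
    """Return how many blobs cannot be decompressed with the given dictionary."""
    failures = 0
    for blob in blobs:
        try:
            unpack(blob, zdict)
        except zlib.error:
            failures += 1
    return failures

items = [
    b"static int a(void) { return 1; }",
    b"struct s { unsigned char c; };",
    b"const char *p;",
]
blobs = [pack(item, SHARED_DICT) for item in items]

# Simulate corruption: flip one bit in the first byte of the dictionary.
damaged = bytearray(SHARED_DICT)
damaged[0] ^= 0x01
damaged = bytes(damaged)

ok_failures = count_failures(blobs, SHARED_DICT)
bad_failures = count_failures(blobs, damaged)
print("failures with intact dictionary:", ok_failures)
print("failures with damaged dictionary:", bad_failures)
```

Without a shared dictionary each item is compressed independently, so a flipped bit damages only the object it lands in (and whatever deltas are based on it).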

Ciao,
Dscho
