Re: Compression and dictionaries

On 8/14/06, David Lang <dlang@xxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 14 Aug 2006, Jon Smirl wrote:
>
>> On 8/14/06, Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote:
>>> I still think that this is important to think through: Is it worth a
>>> couple of kilobytes (I doubt that it would be as much as 1MB in _total_),
>>> and be on the unsafe side?
>>
>> The only "unsafe" aspect I see to this is if the global dictionary
>> doesn't contain any of the words in the documents being encoded. In
>> that case the global dictionary will occupy the short Huffman keys,
>> forcing longer internal keys.  The keys for the words in the document
>> would be longer by about a bit on average.
>
> The other factor that was mentioned was that a single-bit corruption in the
> dictionary would make the entire pack file useless. If this is really a concern
> then just store multiple copies of the dictionary. On a pack with lots of files
> in it, it can still be a significant win.
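
On the key-length point quoted above, here is a rough Python sketch of
where "about a bit" comes from. It is a toy model, not the proposed
encoder: the word lists, the equal weights, and the helper names
(huffman_code_lengths, average_bits) are all made up for illustration.
It builds one Huffman code over the document's words alone, and another
over the document's words plus an equally sized global dictionary that
the document never uses.

```python
import heapq
from itertools import count


def huffman_code_lengths(weights):
    """Return {symbol: code length in bits} for a Huffman code over `weights`."""
    if len(weights) == 1:
        return {sym: 1 for sym in weights}
    tiebreak = count()  # unique counter so the heap never compares the lists
    heap = [(w, next(tiebreak), [sym]) for sym, w in weights.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(weights, 0)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node sits one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (w1 + w2, next(tiebreak), syms1 + syms2))
    return lengths


def average_bits(lengths, doc_weights):
    """Average code length over the words that actually occur in the document."""
    total = sum(doc_weights.values())
    return sum(lengths[s] * w for s, w in doc_weights.items()) / total


# Hypothetical data: 64 document words, 64 unrelated global-dictionary words,
# all with the same weight (an assumption chosen to keep the arithmetic clean).
doc_words = {f"doc{i}": 1 for i in range(64)}
global_words = {f"glob{i}": 1 for i in range(64)}

local_only = huffman_code_lengths(doc_words)
with_global = huffman_code_lengths({**global_words, **doc_words})

local_avg = average_bits(local_only, doc_words)
global_avg = average_bits(with_global, doc_words)
print(local_avg, global_avg)
```

In this model a useless dictionary of the same size as the document's
vocabulary doubles the symbol count, which costs one extra bit per word.
Real word frequencies are skewed, so the actual penalty would differ.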

Bit errors can mess the pack up in lots of ways. If it hits a commit
you won't be able to follow the tree back in time. Packs were never
designed to be error tolerant.
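
To illustrate with a small sketch (Python's zlib module, not git's own
code): objects in a pack are zlib-deflated, and one flipped bit in a
deflated stream is normally enough to make inflation fail. The payload,
the byte offset, and the helper name decompress_with_bit_flip are
arbitrary choices for the example, and the payload is not a real pack
entry.

```python
import zlib


def decompress_with_bit_flip(payload, byte_index, bit):
    """Deflate `payload`, flip one bit of the result, then try to inflate it.

    Returns the inflated bytes, or None if zlib rejects the damaged stream.
    """
    packed = bytearray(zlib.compress(payload))
    packed[byte_index % len(packed)] ^= 1 << bit  # single-bit corruption
    try:
        return zlib.decompress(bytes(packed))
    except zlib.error:
        return None


# Arbitrary sample data standing in for an object body.
original = b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n" * 50
damaged = decompress_with_bit_flip(original, byte_index=10, bit=3)

if damaged is None:
    print("inflate failed after a single-bit flip")
else:
    print("inflate returned data; matches original:", damaged == original)
```

Either way the original bytes are gone, and nothing in the stream lets
you repair them.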

--
Jon Smirl
jonsmirl@xxxxxxxxx