On Mon, 14 Aug 2006, Jon Smirl wrote:
On 8/14/06, David Lang <dlang@xxxxxxxxxxxxxxxxxx> wrote:
On Mon, 14 Aug 2006, Jon Smirl wrote:
> On 8/14/06, Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote:
>> I still think that this is important to think through: Is it worth a
>> couple of kilobytes (I doubt that it would be as much as 1MB in
>> _total_), and be on the unsafe side?
>
> The only "unsafe" aspect I see to this is if the global dictionary
> doesn't contain any of the words in the documents being encoded. In
> that case the global dictionary will occupy the short huffman keys
> forcing longer internal keys. The keys for the words in the document
> would be longer by about a bit on average.
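To put a rough number on the "about a bit" estimate above, here is a minimal sketch rather than anything taken from the pack code. It builds Huffman code lengths for a toy document's words alone, then again with an unused global dictionary mixed in. The word lists, the weights, and the assumption that the global dictionary carries about the same total weight as the document's own words are all made up for illustration.

```python
import heapq
from itertools import count

def huffman_lengths(freqs):
    """Return {symbol: code length in bits} for a frequency table."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), [sym]) for sym, w in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (w1 + w2, next(tiebreak), syms1 + syms2))
    return lengths

def average_bits(lengths, freqs):
    """Average code length over the symbols that actually occur in freqs."""
    total = sum(freqs.values())
    return sum(lengths[s] * w for s, w in freqs.items()) / total

# Hypothetical word counts for one document.
doc = {"foo": 4, "bar": 2, "baz": 1, "qux": 1}
# Hypothetical global dictionary: same total weight, none of its words used.
unused_global = {"g_foo": 4, "g_bar": 2, "g_baz": 1, "g_qux": 1}

alone = average_bits(huffman_lengths(doc), doc)
shared = average_bits(huffman_lengths({**doc, **unused_global}), doc)
penalty = shared - alone
print(alone, shared, penalty)
```

With these toy weights the document's words cost exactly one extra bit each on average. A global dictionary with a different share of the total weight would shift that figure.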
the other factor that was mentioned was that a single-bit corruption in the
dictionary would make the entire pack file useless. if this is really a concern
then just store multiple copies of the dictionary. on a pack with lots of files
in it it can still be a significant win.
Bit errors can mess the pack up in lots of ways. If it hits a commit
you won't be able to follow the tree back in time. Packs were never
designed to be error tolerant.
I'm not claiming that this is a problem; I'm responding to other people's claim
that using a global dictionary for a pack is a problem because if something
happens to that dictionary the whole pack is worthless, by pointing out that, if
this is viewed as a real problem, it's easy to solve.
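
As a rough illustration of the "multiple copies" idea, here is a minimal sketch that assumes each copy of the dictionary is stored next to its own SHA-1 digest, so a reader can skip any copy that no longer verifies. The function names and the in-memory layout are hypothetical, not part of any existing pack format.

```python
import hashlib

def store_dictionary_copies(dictionary: bytes, copies: int = 3):
    """Return a list of (digest, blob) pairs, one per redundant copy."""
    digest = hashlib.sha1(dictionary).digest()
    return [(digest, dictionary) for _ in range(copies)]

def load_dictionary(stored):
    """Return the first copy whose checksum still matches its digest."""
    for digest, blob in stored:
        if hashlib.sha1(blob).digest() == digest:
            return blob
    raise ValueError("all dictionary copies are corrupt")

# Simulate a single-bit corruption in the first copy.
original = b"commit tree parent author committer"
stored = store_dictionary_copies(original)
damaged = bytes([original[0] ^ 0x01]) + original[1:]
stored[0] = (stored[0][0], damaged)

recovered = load_dictionary(stored)
print(recovered == original)
```

The cost is one extra dictionary plus a 20-byte digest per copy, which stays small next to a pack with lots of files in it.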
David Lang