Re: A look at some alternative PACK file encodings

On Wed, 6 Sep 2006, A Large Angry SCM wrote:

> Jon Smirl wrote:
> > On 9/6/06, A Large Angry SCM <gitzilla@xxxxxxxxx> wrote:
> >> TREE objects do not delta or deflate well.
> > 
> > I can understand why they don't deflate, the path names are pretty
> > much unique and the sha1s are incompressible. But why don't they delta
> > well? Does sorting them by size mess up the delta process?
> 
> My guess would be the TREEs would only delta well against other TREE
> versions for the same path.

That's what you'd normally have in a real project, though. I wonder if 
your "pack mashup" lost the normal behaviour: we very much sort trees 
together normally, thanks to the "sort-by-filename, then by size" 
behaviour that git-pack-objects should have (for trees, the size normally 
shouldn't change, so the sorting should basically boil down to "sort the 
same directory together, keeping the ordering it had from git-rev-list").

Btw, that "keeping the ordering it had" part I'm not convinced we actually 
enforce. That would depend on the sort algorithm used by "qsort()", I 
think. So there might be room for improvement there in order to keep 
things in recency order.
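
To illustrate the qsort() concern: the C standard does not promise a stable
sort, so entries that compare equal may come out in any order. One common
way around that is to record each entry's input position and use it as the
final tie-breaker. The sketch below assumes a hypothetical `struct entry`
with made-up field names, and an ascending size order chosen only for the
example; it is not the actual git-pack-objects code.

```c
#include <stdlib.h>

/* Hypothetical entry layout; the field names are illustrative only. */
struct entry {
    unsigned int name_hash;  /* stands in for the "sort-by-filename" key */
    unsigned long size;      /* object size */
    size_t recency;          /* position in the incoming (rev-list) order */
};

static int cmp_entry(const void *pa, const void *pb)
{
    const struct entry *a = pa;
    const struct entry *b = pb;

    if (a->name_hash != b->name_hash)
        return a->name_hash < b->name_hash ? -1 : 1;
    if (a->size != b->size)
        return a->size < b->size ? -1 : 1;
    /*
     * Tie-breaker: qsort() is not guaranteed to be stable, so compare
     * the recorded input position explicitly.
     */
    if (a->recency != b->recency)
        return a->recency < b->recency ? -1 : 1;
    return 0;
}

/* Sort by name, then size, keeping the incoming order among equals. */
static void sort_entries(struct entry *list, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        list[i].recency = i;  /* remember the incoming order */
    qsort(list, n, sizeof(*list), cmp_entry);
}
```

Because no two entries ever compare equal, the result no longer depends on
which algorithm the C library uses behind qsort().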

> Just looking at the structures in non-BLOBS, I see a lot of potential
> for the use of a set of dictionaries when deflating TREEs and another set
> of dictionaries when deflating COMMITs and TAGs. The low hanging fruit
> is to create dictionaries of the most referenced IDs across all TREE or
> COMMIT/TAG objects.

Is there any way to get zlib to just generate a suggested dictionary from 
a given set of input?
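
As far as I know, zlib only accepts a preset dictionary that the caller
supplies (deflateSetDictionary() on the compressing side,
inflateSetDictionary() on the other); it does not derive one from sample
input. The "most referenced IDs" idea quoted above could be sketched by
hand as below. Everything here is an assumption made for illustration:
the function name build_id_dict(), the flat array of raw 20-byte IDs as
input, and the choice to put the most referenced IDs last, which follows
the advice in zlib.h to place the most commonly used strings towards the
end of the dictionary. The sketch does not call zlib itself.

```c
#include <stdlib.h>
#include <string.h>

#define ID_LEN 20  /* length of a raw SHA-1 */

struct id_count {
    unsigned char id[ID_LEN];
    size_t count;
};

static int cmp_id(const void *a, const void *b)
{
    return memcmp(a, b, ID_LEN);
}

static int cmp_count(const void *pa, const void *pb)
{
    const struct id_count *a = pa;
    const struct id_count *b = pb;

    if (a->count != b->count)
        return a->count < b->count ? -1 : 1;
    return memcmp(a->id, b->id, ID_LEN);  /* deterministic tie-break */
}

/*
 * ids:  n raw IDs stored back to back (n * ID_LEN bytes); sorted in place.
 * dict: receives at most max_ids IDs, least referenced first.
 * Returns the number of bytes written, or 0 on empty input or
 * allocation failure.
 */
static size_t build_id_dict(unsigned char *ids, size_t n,
                            unsigned char *dict, size_t max_ids)
{
    struct id_count *tab;
    size_t i, uniq = 0, first, out = 0;

    if (!n || !max_ids)
        return 0;
    tab = malloc(n * sizeof(*tab));
    if (!tab)
        return 0;

    /* Sort so that equal IDs are adjacent, then count each run. */
    qsort(ids, n, ID_LEN, cmp_id);
    for (i = 0; i < n; i++) {
        const unsigned char *cur = ids + i * ID_LEN;

        if (uniq && !memcmp(tab[uniq - 1].id, cur, ID_LEN)) {
            tab[uniq - 1].count++;
        } else {
            memcpy(tab[uniq].id, cur, ID_LEN);
            tab[uniq].count = 1;
            uniq++;
        }
    }

    /* Ascending by count, so the most referenced IDs end up last. */
    qsort(tab, uniq, sizeof(*tab), cmp_count);
    first = uniq > max_ids ? uniq - max_ids : 0;
    for (i = first; i < uniq; i++) {
        memcpy(dict + out, tab[i].id, ID_LEN);
        out += ID_LEN;
    }
    free(tab);
    return out;
}
```

The resulting buffer is what would be handed to deflateSetDictionary()
before compressing each TREE, with the same bytes given to
inflateSetDictionary() when reading it back. Whether that actually beats
plain deflate on real packs would have to be measured.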

		Linus