Re: A look at some alternative PACK file encodings

>> An alternative would be to create a small "placeholder" object that
>> just gives an ID, then refer to it by offset.
>>
>> That would avoid the need for an id/offset bit with every offset,
>> and possibly save more space if the same object was referenced
>> multiple times.
>>
>> And it just seems simpler.

> There are 2 million objects in the Mozilla pack. This table would take:
> 2M * (20b (sha) + 10b (object index/overhead)) = 60MB
> This 60MB is pretty much incompressible and increases download time.
> 
> Much better if storage of the sha1s can be totally eliminated and
> replaced by something smaller. Alternatively this map could be
> stripped for transmission and rebuilt locally.

Um, I think I wasn't clear.  Objects in a "thin" pack (for network
updating of a different pack) that are referred to but not included
would have stand-ins containing just the object ID.  Objects that *are*
present would simply be present and referred to by offset as usual.

Imagine you have a "thin" pack containing a delta against an object that
the recipient already has, so the object isn't in the pack.  The delta has
to specify the base object somehow.  If the base object is in the pack, you
can specify it by offset.  If it's not, you can either:

- Generalize the base object pointer to allow an object ID option, or
- Provide a pointer to a magic kind of "external reference" pointer
  object.

I was proposing the latter.

For regular packs, such objects wouldn't even be present, because
all base objects are in the pack itself.

And, of course, you'd only create such objects if you needed to,
if there was at least one pointer to them.
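To make the idea concrete, here is a rough sketch of how a reader might
resolve a delta base under this scheme.  It is not real pack-reading code:
the names (ExternalRef, Blob, Delta, resolve_base), the dict standing in for
the pack, and the dict standing in for the recipient's existing objects are
all made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class ExternalRef:
    """Hypothetical stand-in object: carries only the ID of an object
    that is referred to but not included in the thin pack."""
    object_id: bytes  # 20-byte SHA-1


@dataclass
class Blob:
    """An ordinary object that is actually present in the pack."""
    data: bytes


@dataclass
class Delta:
    """A delta whose base pointer is always an offset into this pack."""
    base_offset: int
    payload: bytes


PackObject = Union[ExternalRef, Blob, Delta]


def resolve_base(pack: Dict[int, PackObject],
                 delta: Delta,
                 local_store: Dict[bytes, bytes]) -> bytes:
    """Return the base object's content, following at most one
    indirection through an ExternalRef stand-in."""
    base = pack[delta.base_offset]
    if isinstance(base, ExternalRef):
        # The base is not in the pack; the recipient must already have it.
        return local_store[base.object_id]
    if isinstance(base, Blob):
        return base.data
    raise NotImplementedError("delta chains are left out of this sketch")


# Two deltas share one stand-in at offset 0, so the 20-byte ID is stored once.
ext_id = bytes(range(20))
local_store = {ext_id: b"base content the recipient already has"}
pack = {
    0: ExternalRef(ext_id),
    21: Delta(base_offset=0, payload=b"delta one"),
    40: Delta(base_offset=0, payload=b"delta two"),
    60: Blob(b"in-pack base"),
    80: Delta(base_offset=60, payload=b"delta three"),
}
```

Note that the delta itself never needs an "is this an ID or an offset?"
flag; only the object it points at says which case applies.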

Compared to putting the object ID directly in the pointer, this approach has
Cost:	An extra offset pointer and object header.
	Extra time following the indirection when resolving the pointer.
Benefit: Non-indirect object pointers are a bit smaller.
	The code is simpler.
	Second and later references to the same external object cost
	another offset, not another 20 bytes.