Re: [PATCH] pack-format.txt: more details on pack file format

>>> +Deltified representation
>>
>> Does this refer to OFS delta as well as REF deltas?
>
> Yes. Both OFS and REF deltas have the same "body" which is what this
> part is about. The differences between OFS and REF deltas are not
> described (in fact I don't think we describe what OFS and REF deltas
> are at all).

Maybe we should?

>
>>> is a sequence of one byte command optionally
>>> +followed by more data for the command. The following commands are
>>> +recognized:
>>
>> So a Deltified representation of an object is a 6 or 7 in the 3-bit type
>> and then the length. Then a command shows how to construct
>> the object based on other objects. Can there be more commands?
>>
>>> +- If bit 7 is set, the remaining bits in the command byte specify
>>> +  how to extract copy offset and size to copy. The following must be
>>> +  evaluated in this exact order:
>>
>> So there are 2 modes, and the high bit indicates which mode is used.
>> You start describing the more complicated mode first,
>> maybe give names to both of them? "direct copy" (below) and
>> "compressed copy with offset" ?
>
> I started to update this further because this text is hard to follow
> even for me. So let's get the background first.
>
> We have a source object somewhere (the object name comes from ofs/ref
> delta's header), basically we have the whole content. This delta
> thingy tells us how to use that source object to create a new (target)
> object.
>
> The delta is actually a sequence of instructions (of variable length).

The previous paragraph and this sentence are great for my understanding.
Thanks! (Maybe keep it around in a similar form?)

> One is for copying from the source object.

OK, that makes sense. I can think of it as an "HTTP range request", just
optimized for packfiles, where the source is inside the same pack.
So it would say "Go to object <sha1> and copy bytes 13-168 here".

> The other copies from the
> delta itself

Does "itself" mean the same object that we are describing here,
or does it mean other deltas?

> (e.g. this is new data in the target which is not
> available anywhere in the source object to copy from).
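
To make the two kinds of instruction concrete, here is a minimal sketch in
Python of how a target object could be rebuilt from a source object. It is
only a model of the idea described above, not git's code: the function name
and the tuple form ("copy", offset, size) / ("insert", data) are made up for
illustration, and the byte-level encoding is left out entirely.

```python
def apply_instructions(source: bytes, instructions) -> bytes:
    """Rebuild a target object from `source` and pre-parsed instructions.

    Hypothetical instruction forms (not the on-disk encoding):
      ("copy", offset, size)  copy `size` bytes from `source` at `offset`
      ("insert", data)        append new bytes carried by the delta itself
    """
    target = bytearray()
    for ins in instructions:
        if ins[0] == "copy":
            _, offset, size = ins
            if offset < 0 or size < 0 or offset + size > len(source):
                raise ValueError("copy range lies outside the source object")
            target += source[offset:offset + size]
        elif ins[0] == "insert":
            target += ins[1]
        else:
            raise ValueError(f"unknown instruction {ins[0]!r}")
    return bytes(target)


source = b"hello world\n"
result = apply_instructions(
    source,
    [("copy", 0, 6), ("insert", b"git"), ("copy", 11, 1)],
)
print(result)
```

Here the first and last instructions reuse bytes of the source, and the
middle one supplies data that exists only in the delta.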




>
> The instruction looks like this
>
>         bit      0        1        2        3       4      5      6
>   +----------+--------+--------+--------+--------+------+------+------+
>   | 1xxxxxxx | offset | offset | offset | offset | size | size | size |
>   +----------+--------+--------+--------+--------+------+------+------+
>
> Here you can see it in its full form, each box represents a byte. The
> first byte has bit 7 set as mentioned. We can see here that the offset
> (where to copy from in the source object) takes 4 bytes and the size (how
> many bytes to copy) takes 3. Offset and size are both in LSB order.
>
> The "xxxxxxx" part lets us shrink this down.

.. by indicating which bytes we can skip and assume to be all zero(?)

> If the offset can fit in
> 16 bits, there's no reason to waste the last two bytes describing
> zero. Each 'x' marks whether the corresponding byte is present.

So for a full instruction (as above), we'd have

1 1111 111 <4 bytes offset> <3 bytes size>

for smaller instructions we have

1 1100 100 <2 bytes offset> <1 byte size>
and here the offset is in range 0..64k and
the size is 1-255 or 0x10000 ?


Modes to skip bytes in between are not allowed, e.g.
1 1101 101 < 3 bytes of offsets> <2 bytes of size>
and the missing bytes would be assumed to be 0?

> The
> bit number is in the first row. So if you have offset 255 and size 1,
> the instruction is three bytes 10010001b, 255, 1.

Oh it is the other way round, the size will be just one byte,
indicating we can have a range of 1-255 or 0x10000 and an
offset of 0..255.
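
To double-check that reading, here is a small Python sketch that decodes one
copy instruction. It is not git's code, and `decode_copy` is a made-up name.
It assumes that each flag bit independently marks whether its byte is
present, that absent bytes count as zero, and that a resulting size of zero
stands for 0x10000.

```python
def decode_copy(delta: bytes, pos: int = 0):
    """Decode one copy instruction starting at delta[pos].

    Returns (offset, size, next_pos). Assumptions are listed in the
    paragraph above; this is an illustrative sketch only.
    """
    cmd = delta[pos]
    if not cmd & 0x80:
        raise ValueError("not a copy instruction (bit 7 is clear)")
    pos += 1

    offset = 0
    for i in range(4):                 # bits 0..3: offset bytes, LSB first
        if cmd & (1 << i):
            offset |= delta[pos] << (8 * i)
            pos += 1

    size = 0
    for i in range(3):                 # bits 4..6: size bytes, LSB first
        if cmd & (0x10 << i):
            size |= delta[pos] << (8 * i)
            pos += 1

    if size == 0:                      # assumption: zero means 0x10000
        size = 0x10000
    return offset, size, pos


# The "offset 255, size 1" example: command byte 10010001b, then 255, then 1.
print(decode_copy(bytes([0b10010001, 255, 1])))
```

With bit 0 and bit 4 set, one offset byte and one size byte follow the
command byte, so the whole instruction is three bytes long.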

>
> I think this is a corner case in this format. I think Nico meant to
> specify consecutive bytes: if size is 2 bytes then you have to specify
> _both_ of them even if the first byte could be zero and omitted.

So it is not a mutually exclusive group, but a sequence (similar to
git-bisect), where we start with 0 and end with exactly one edge
in between (sort of; we can also start with 1, and then we have to have
all 1s).

> The implementation detail is, if bit 6 is set but bit 4 is not, then
> the size value is pretty much random. It's only when bit 4 is set that
> we first clear out "size" and start adding bits to it.

That sounds similar to what I spelled out above.
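
For the encoding direction, here is a cautious Python sketch that never
leaves gaps: it writes every byte up to the highest non-zero one, so a
2-byte size always emits both bytes, as described for the corner case above.
The name `encode_copy` is made up, the range checks come from the 4-byte
offset and 3-byte size in the box diagram, and writing size 0x10000 as "no
size bytes" is an assumption about how a zero size is read back.

```python
def encode_copy(offset: int, size: int) -> bytes:
    """Encode one copy instruction without gaps between present bytes.

    Illustrative sketch only; see the assumptions in the paragraph above.
    """
    if not 0 <= offset <= 0xFFFFFFFF:
        raise ValueError("offset must fit in 4 bytes")
    if not 1 <= size <= 0xFFFFFF:
        raise ValueError("size must be between 1 and 0xFFFFFF")
    if size == 0x10000:
        size = 0                       # assumption: zero is read as 0x10000

    cmd = 0x80                         # bit 7 marks a copy instruction
    body = bytearray()

    n_offset = (offset.bit_length() + 7) // 8
    for i in range(n_offset):          # bits 0..3, LSB first, no gaps
        cmd |= 1 << i
        body.append((offset >> (8 * i)) & 0xFF)

    n_size = (size.bit_length() + 7) // 8
    for i in range(n_size):            # bits 4..6, LSB first, no gaps
        cmd |= 0x10 << i
        body.append((size >> (8 * i)) & 0xFF)

    return bytes([cmd]) + bytes(body)


print(encode_copy(255, 1).hex())       # command byte, then 255, then 1
print(encode_copy(0, 0x100).hex())     # both size bytes, low byte is zero
```

A size of 0x100 comes out as command byte 0xb0 followed by 0x00 0x01, so bit
4 is set even though the low size byte is zero.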

Thanks for taking on the documentation here.
The box with numbers really helped me!

Stefan


