Re: reftable [v7]: new ref storage format

On Tue, Aug 15, 2017 at 11:15 PM, Stefan Beller <sbeller@xxxxxxxxxx> wrote:
> On Tue, Aug 15, 2017 at 7:48 PM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
>> 7th iteration of the reftable storage format.
>>
>> You can read a rendered version of this here:
>> https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md
>>
>> Changes from v6:
>> - Blocks are variable sized, and alignment is optional.
>> - ref index is required on variable sized multi-block files.
>>
>> - restart_count/offsets are again at the end of the block.
>> - value_type = 0x3 is only for symbolic references.
>> - "other" files cannot be stored in reftable.
>>
>> - object blocks are explicitly optional.
>> - object blocks use position (offset in bytes), not block id.
>> - removed complex log_chained format for log blocks
>>
>> - Layout uses log, ref file extensions
>> - Described reader algorithm to obtain a snapshot
>
> - back to the old "intra-block index is last"
>   for all block types. ok.

Yes, it simplifies "streaming writers" who don't want to buffer a lot.

> - changed (only ref?) indexes to start char + 3 byte size:
>   Which starting char do object/log indexes have?

All index blocks use 'i'.

> "Unaligned files must include the ref index to support fast lookup."
>
> Why this? I would imagine the client (which has ~5 branches),
> would not need this, but only a ref block, that's it.

The quoted part is, I think, incomplete. Unaligned files need the ref
index if there is more than one ref block, as there is no way to
divide the space for binary search. A single ref block with 5 branches
does not need the ref index.
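
As a rough illustration of that rule (a sketch only, not text from the
spec; the function name and parameters are made up for this example):

```python
def ref_index_required(aligned: bool, ref_block_count: int) -> bool:
    """Hypothetical helper: must a writer emit the ref index?

    Encodes only the rule stated above: an unaligned (variable sized)
    file with more than one ref block cannot be binary searched by
    dividing the space, so it needs the index.
    """
    if ref_block_count <= 1:
        # A single ref block is read directly; nothing to search over.
        return False
    # Aligned multi-block files may omit the index; unaligned ones may not.
    return not aligned
```

So a client with 5 branches in one ref block gets
`ref_index_required(False, 1) == False`.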

> Ctrl-F for 'block_size' reveals nothing is computed
> relative to the block_size in this format, yet we can
> set it to an arbitrary number. If following the spec,
> the reader at $DAY_JOB needs to be able to read
> both aligned and unaligned reftables, despite our plan
> to ever write aligned ref tables, what would the reader
> use the block_size for? (I think we can omit that field
> from the header/footer now, no?)

It's really helpful to have it present so the reader knows how to locate
and read blocks. If the ref index is missing and there are multiple
ref blocks in an aligned file, a reader can use block_size to divide
the space and perform binary search. Even when the ref index is
present, the reader can use block_size to issue a disk IO read of
block_size bytes without reading the block_len of the target block
first.
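
A minimal sketch of that binary search, under simplifying assumptions:
the region being searched holds only ref blocks, each starting at a
multiple of block_size, and the file header, footer and other block
types are ignored. `first_ref_at` is a hypothetical callback that reads
the block at a byte offset and returns its first ref name; it is not an
API from JGit or the spec.

```python
from typing import Callable


def find_ref_block(region_len: int, block_size: int,
                   first_ref_at: Callable[[int], str], wanted: str) -> int:
    """Return the byte offset of the only block that could hold `wanted`.

    Blocks are sorted by ref name, so the candidate is the last block
    whose first ref name is <= wanted (or block 0 if there is none).
    """
    lo = 0
    hi = (region_len + block_size - 1) // block_size  # number of blocks
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # block_size lets us jump straight to the start of block `mid`.
        if first_ref_at(mid * block_size) <= wanted:
            lo = mid
        else:
            hi = mid
    return lo * block_size


# Usage with an in-memory stand-in for three 4096-byte blocks.
first_names = ["refs/heads/a", "refs/heads/m", "refs/tags/v1"]
offset = find_ref_block(3 * 4096, 4096,
                        lambda pos: first_names[pos // 4096],
                        "refs/heads/main")
```

The caller would still have to scan the returned block to learn whether
the ref is actually present.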

At $DAY_JOB the block_size is tunable by the writer and could change
at any time, so it's useful to have it embedded in the output.


