Re: [PATCH v4 4/5] Add reftable library

Han-Wen Nienhuys <hanwen@xxxxxxxxxx> · Tue, 11 Feb 2020 17:40:56 +0100

On Tue, Feb 11, 2020 at 5:32 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Here is what I saw:
>
>     A 24-byte header appears at the beginning of the file:
>
>         'REFT'
>         uint8( format_version )
>         uint24( block_size )
>         uint64( min_update_index )
>         uint64( max_update_index )
>
>     The `format_version` is a byte, and it indicates both the version of the on-disk
>     format, as well as the size of the hash. The hash size is indicated in the MSB
>     of the `format_version`. For the SHA1 hash, `(format_version & 0x80) == 0` and all
>     hash values are 20 bytes. For SHA256, `(format_version & 0x80) != 0`, and all hash
>     values are 32 bytes. Future hash functions may be added by using more bits at
>     the right.
>
>     The file format version can be extracted as `format_version & 0x7f`. Currently,
>     only version 1 is defined.
>
> If you cast in stone that "& 0x7f is the way to extract the
> version", then you cannot promise that you may steal more bits at
> the right of MSB to support more hash functions, as you've reserved
> the rightmost 7 bits already for the version number with 0x7f and
> there are only 8 bits in your byte.
>
> It seems that you are trying to make the format too dense?  Is it
> too much of a waste to use a separate word or a byte for hash?  Or
> perhaps declare that format version 1 uses SHA-1, format version 2
> uses SHA-256, etc. (in other words, do we want to support both SHA-1
> and SHA-256 when we are at format version 7)?

I can see a future where we have a different format that allows for
more metadata so we can encode the hash size separately. But maybe
that can be for format v3 and up.

Let's do format v2 = format v1 but with 32-byte hashes.
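
A rough sketch of what that would mean for a reader of the 24-byte
header, under some assumptions: the multi-byte fields are big-endian,
and the names `struct reft_header`, `parse_reft_header` and
`hash_size_for_version` are hypothetical, made up for illustration and
not taken from the patch. The version byte is used whole, with no bits
masked off for the hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REFT_HEADER_SIZE 24 /* 4 magic + 1 version + 3 block size + 8 + 8 */

/* Hypothetical struct for this sketch; not the library's actual type. */
struct reft_header {
	uint8_t version;
	uint32_t block_size;       /* stored on disk as uint24 */
	uint64_t min_update_index;
	uint64_t max_update_index;
	int hash_size;             /* derived from version, not stored */
};

/* Read a big-endian unsigned integer of `len` bytes (assumed byte order). */
static uint64_t get_be(const unsigned char *p, int len)
{
	uint64_t v = 0;
	for (int i = 0; i < len; i++)
		v = (v << 8) | p[i];
	return v;
}

/* v1 = 20-byte hashes, v2 = 32-byte hashes; 0 means unknown version. */
static int hash_size_for_version(uint8_t version)
{
	switch (version) {
	case 1: return 20;
	case 2: return 32;
	default: return 0;
	}
}

/* Returns 0 on success, -1 on short input, bad magic or unknown version. */
static int parse_reft_header(const unsigned char *buf, size_t len,
			     struct reft_header *out)
{
	assert(buf != NULL && out != NULL);
	if (len < REFT_HEADER_SIZE || memcmp(buf, "REFT", 4) != 0)
		return -1;
	out->version = buf[4];
	out->hash_size = hash_size_for_version(out->version);
	if (out->hash_size == 0)
		return -1;
	out->block_size = (uint32_t)get_be(buf + 5, 3);
	out->min_update_index = get_be(buf + 8, 8);
	out->max_update_index = get_be(buf + 16, 8);
	return 0;
}
```

With this shape, a byte such as 0x81 is simply an unknown version and
gets rejected, rather than being read as "v1 with the SHA-256 bit set".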

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.