On Tue, Feb 11, 2020 at 5:32 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Here is what I saw:
>
>     A 24-byte header appears at the beginning of the file:
>
>         'REFT'
>         uint8( format_version )
>         uint24( block_size )
>         uint64( min_update_index )
>         uint64( max_update_index )
>
>     The `format_version` is a byte, and it indicates both the version of the on-disk
>     format, as well as the size of the hash. The hash size is indicated in the MSB
>     of the `format_version`. For the SHA1 hash, `format_version & 0x80 == 0` and all
>     hash values are 20 bytes. For SHA256, `format_version & 0x80 == 1`, and all hash
>     values are 32 bytes. Future hash functions may be added by using more bits at
>     the right.
>
>     The file format version can be extracted as `format_version & 0x7f`. Currently,
>     only version 1 is defined.
>
> If you cast in stone that "& 0x7f is the way to extract the
> version", then you cannot promise that you may steal more bits at
> the right of MSB to support more hash functions, as you've reserved
> the rightmost 7 bits already for the version number with 0x7f and
> there are only 8 bits in your byte.
>
> It seems that you are trying to make the format too dense? Is it
> too much a waste to use a separate word or a byte for hash? Or
> perhaps declare that format version 1 uses SHA-1, format version 2
> uses SHA-256, etc. (in other words, do we want to support both SHA-1
> and SHA-256 when we are at format version 7)?

I can see a future where we have a different format that allows for
more metadata so we can encode the hash size separately. But maybe
that can be for format v3 and up.

Let's do format v2 = format v1 but with 32-byte hashes.

--
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Court of registration and number: Hamburg, HRB 86891
Registered office: Hamburg
Managing directors: Paul Manicle, Halimah DeLaine Prado
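
A minimal sketch of reading the 24-byte header under the "v2 = v1 but with 32-byte hashes" scheme proposed above. It assumes the integer fields are big-endian, and `parse_header` and the dictionary it returns are hypothetical names chosen for illustration, not taken from any existing reftable implementation:

```python
import struct

HEADER_SIZE = 24  # 4 (magic) + 1 (version) + 3 (block size) + 8 + 8

# Mapping proposed in this thread: the whole byte is the version,
# and the version alone determines the hash size.
HASH_SIZE_BY_VERSION = {1: 20, 2: 32}  # v1 = SHA-1, v2 = SHA-256


def parse_header(buf: bytes) -> dict:
    """Parse the fixed-size file header (hypothetical helper)."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("header too short")
    if buf[:4] != b"REFT":
        raise ValueError("bad magic")

    version = buf[4]  # no bit masking: no bits are reserved for the hash
    if version not in HASH_SIZE_BY_VERSION:
        raise ValueError(f"unsupported format version {version}")

    # uint24 has no struct format code, so decode it by hand.
    # Big-endian byte order is an assumption here.
    block_size = int.from_bytes(buf[5:8], "big")
    min_update_index, max_update_index = struct.unpack(">QQ", buf[8:24])

    return {
        "version": version,
        "hash_size": HASH_SIZE_BY_VERSION[version],
        "block_size": block_size,
        "min_update_index": min_update_index,
        "max_update_index": max_update_index,
    }
```

Because the version byte is used whole, a later format (v3 and up) remains free to carry the hash identifier in a separate field without conflicting with a `& 0x7f` mask.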