On Sun, Jul 23, 2017 at 3:56 PM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
> On Mon, Jul 17, 2017 at 6:43 PM, Michael Haggerty <mhagger@xxxxxxxxxxxx> wrote:
>> On Sun, Jul 16, 2017 at 12:43 PM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
>>> On Sun, Jul 16, 2017 at 10:33 AM, Michael Haggerty <mhagger@xxxxxxxxxxxx> wrote:
>
>> * What would you think about being extravagant and making the
>> value_type a full byte? It would make the format a tiny bit easier to
>> work with, and would leave room for future enhancements (e.g.,
>> pseudorefs, peeled symrefs, support for the successors of SHA-1s)
>> without having to change the file format dramatically.
>
> I reran my 866k file with full byte value_type. It pushes up the
> average bytes per ref from 33 to 34, but the overall file size is
> still 28M (with 64 block size). I think its reasonable to expand this
> to the full byte as you suggest.

FYI, I went back on this in the v3 draft I posted on Jul 22 in
https://public-inbox.org/git/CAJo=hJvxWg2J-yRiCK3szux=eYM2ThjT0KWo-SFFOOc1RkxXzg@xxxxxxxxxxxxxx/

I expanded value_type from 2 bits to 3 bits, but kept it as a bit
field in a varint. I just couldn't justify the additional byte per ref
in these large files. The prefix compression works well enough that
many refs are still able to use only a single byte for the
suffix_length << 3 | value_type varint, keeping the average at 33
bytes per ref.

The reftable format uses values 0-3, leaving 4-7 available. I reserved
4 for an arbitrary payload like MERGE_HEAD type files.
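
To make the single-byte claim concrete, here is a rough sketch of packing suffix_length and a 3-bit value_type into one varint. It assumes a generic 7-bits-per-byte varint with a continuation bit; the exact byte layout in the draft may differ, and the function names are made up for illustration only. Under that assumption, any suffix shorter than 16 bytes still fits in one byte.

```python
# Illustrative sketch only: a generic LEB128-style varint is ASSUMED here;
# the byte layout used by the actual reftable draft may differ.
# All function names below are hypothetical.

VALUE_TYPE_BITS = 3
VALUE_TYPE_MASK = (1 << VALUE_TYPE_BITS) - 1  # 0b111, so value_type is 0..7


def encode_varint(n: int) -> bytes:
    """Encode a non-negative int, 7 payload bits per byte, low bits first."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Inverse of encode_varint for a buffer holding exactly one varint."""
    result = 0
    shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


def pack_suffix_and_type(suffix_length: int, value_type: int) -> bytes:
    """Pack suffix_length << 3 | value_type into a single varint."""
    if not 0 <= value_type <= VALUE_TYPE_MASK:
        raise ValueError("value_type must fit in 3 bits")
    if suffix_length < 0:
        raise ValueError("suffix_length must be non-negative")
    return encode_varint((suffix_length << VALUE_TYPE_BITS) | value_type)


def unpack_suffix_and_type(data: bytes) -> tuple[int, int]:
    """Split the decoded varint back into (suffix_length, value_type)."""
    v = decode_varint(data)
    return v >> VALUE_TYPE_BITS, v & VALUE_TYPE_MASK


if __name__ == "__main__":
    for suffix_length in (3, 15, 16, 40):
        packed = pack_suffix_and_type(suffix_length, 1)
        print(suffix_length, len(packed), packed.hex())
```

With 3 bits taken by value_type, one byte leaves 4 bits for suffix_length (0-15). With the earlier 2-bit field the one-byte limit was 31, so the wider field only costs an extra byte for refs whose suffix after prefix compression is 16-31 bytes long.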