"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> From: Han-Wen Nienhuys <hanwen@xxxxxxxxxx>
>
> Signed-off-by: Han-Wen Nienhuys <hanwen@xxxxxxxxxx>
> ---
>  Documentation/technical/reftable.txt | 50 ++++++++++++++++------------
>  1 file changed, 28 insertions(+), 22 deletions(-)
>
> diff --git a/Documentation/technical/reftable.txt b/Documentation/technical/reftable.txt
> index 9fa4657d9ff..ee3f36ea851 100644
> --- a/Documentation/technical/reftable.txt
> +++ b/Documentation/technical/reftable.txt
> @@ -193,8 +193,8 @@ and non-aligned files.
>  Very small files (e.g. 1 only ref block) may omit `padding` and the ref

Hmph, I am seeing nbsp before '1' and am wondering where it came
from.

>  index to reduce total file size.
>
> -Header
> -^^^^^^
> +Header (version 1)
> +^^^^^^^^^^^^^^^^^^
>
>  A 24-byte header appears at the beginning of the file:
>
> @@ -215,6 +215,24 @@ used in a stack for link:#Update-transactions[transactions], these
>  fields can order the files such that the prior file’s
>  `max_update_index + 1` is the next file’s `min_update_index`.

Am I correct to assume that we do not plan to support a repository
with mixed set of refs, some referring to a commit with its SHA-1
object name while others using SHA-256 object name?

> +Header (version 2)
> +^^^^^^^^^^^^^^^^^^
> +
> +A 28-byte header appears at the beginning of the file:
> +
> +....
> +'REFT'
> +uint8( version_number = 1 )

Shouldn't this be 2 instead, as v1 lacked the Hash-id field?

> +uint24( block_size )
> +uint64( min_update_index )
> +uint64( max_update_index )
> +uint32( hash_id )
> +....
> +
> +The header is identical to `version_number=1`, with the 4-byte hash ID
> +("sha1" for SHA1 and "s256" for SHA-256) append to the header.

Am I correct to assume that SHA-1 repositories are encouraged to use
version 2 when the code becomes available?

>  First ref block
>  ^^^^^^^^^^^^^^^
>
> @@ -671,14 +689,8 @@ Footer
>  After the last block of the file, a file footer is written. It begins
>  like the file header, but is extended with additional data.
>
> -A 68-byte footer appears at the end:
> -
>  ....
> - 'REFT'
> - uint8( version_number = 1 )
> - uint24( block_size )
> - uint64( min_update_index )
> - uint64( max_update_index )
> + HEADER
>
>   uint64( ref_index_position )
>   uint64( (obj_position << 5) | obj_id_len )
> @@ -701,12 +713,16 @@ obj blocks.
>  * `obj_index_position`: byte position for the start of the obj index.
>  * `log_index_position`: byte position for the start of the log index.
>
> +The size of the footer is 68 bytes for version 1, and 72 bytes for
> +version 2.
> +
>  Reading the footer
>  ++++++++++++++++++
>
> -Readers must seek to `file_length - 68` to access the footer. A trusted
> -external source (such as `stat(2)`) is necessary to obtain
> -`file_length`. When reading the footer, readers must verify:
> +Readers must first read the file start to determine the version
> +number. Then they seek to `file_length - FOOTER_LENGTH` to access the
> +footer. A trusted external source (such as `stat(2)`) is necessary to
> +obtain `file_length`. When reading the footer, readers must verify:

In any case, the size of this patch is pleasant to see---it must be
a sign that the previous step was done well not to hardcode the
"hash size is 20 bytes" assumption all over the place and instead
used "this field holds N+m bytes where N is the size of the hash
described in the REFT header" consistently.

Nicely done.
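
To make the revised "Reading the footer" wording concrete, here is a
minimal sketch of how a reader might pick the footer position from the
version byte. The function name `footer_offset()` and the two constants
are hypothetical and are not taken from the reftable code. The sizes (68
and 72 bytes) and the 'REFT' magic come from the quoted text, and the
sketch assumes the version number in a version 2 file is 2, as suggested
above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Footer sizes as stated in the patch: 68 bytes for v1, 72 for v2. */
enum { FOOTER_LENGTH_V1 = 68, FOOTER_LENGTH_V2 = 72 };

/*
 * Hypothetical helper, for illustration only.
 *
 * "start" holds the first bytes of the file (at least the 4-byte magic
 * plus the version byte); "file_length" comes from a trusted source
 * such as stat(2).  Returns the byte offset of the footer, or -1 if
 * the magic, the version, or the length is not acceptable.
 */
static int64_t footer_offset(const unsigned char *start, size_t start_len,
			     uint64_t file_length)
{
	uint64_t footer_length;

	if (start_len < 5 || memcmp(start, "REFT", 4))
		return -1;

	switch (start[4]) {
	case 1:
		footer_length = FOOTER_LENGTH_V1;
		break;
	case 2:
		footer_length = FOOTER_LENGTH_V2;
		break;
	default:
		return -1; /* unknown version */
	}

	if (file_length < footer_length)
		return -1; /* too short to hold a footer */

	return (int64_t)(file_length - footer_length);
}
```

A caller would read the first five bytes, pass them in together with the
size reported by stat(2), and seek to the returned offset before running
the footer checks the document lists.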