On Wed, Aug 16, 2017 at 12:47 AM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote: > On Mon, Aug 14, 2017 at 5:13 AM, Michael Haggerty <mhagger@xxxxxxxxxxxx> wrote: >> On 08/07/2017 03:47 AM, Shawn Pearce wrote: >>> 6th iteration of the reftable storage format. > [...] >>> #### index record >>> >>> An index record describes the last entry in another block. >>> Index records are written as: >>> >>> varint( prefix_length ) >>> varint( (suffix_length << 3) | 0 ) >>> suffix >>> varint( block_position ) >>> >>> Index records use prefix compression exactly like `ref_record`. >>> >>> Index records store `block_position` after the suffix, specifying the >>> absolute position in bytes (from the start of the file) of the block >>> that ends with this reference. >> >> Is there a reason that the index lists the *last* refname that is >> contained in a block rather than the *first* refname? I can't think of a >> reason to choose one vs. the other, but your choice was initially >> surprising. I don't think it matters either way; I was just curious. > > Yes, there is a reason. When a reader is searching the index block and > discovers a key that is greater than their search needle, they are now > sitting on a record with the block_position for that greater key. By > using the *last* refname the current block_position is the one to seek > to. > > If instead we used *first* refname, the reader would now have to > backtrack to the prior index record to get the block_position out of > that record. Or it has to keep a running "prior_position" local > variable. > > Using last simplifies the reader's code. Ah, OK. I was thinking of this as being a binary search, in which case you *must* see both bracketing records before you are done, and the chances are 50-50 which one you see first. But this search is a little bit different, because the index records within a restart block have to be scanned linearly. So it is much more likely that you see the "before" record followed by the "after" record. Thanks for the explanation. Michael