On Thu, Aug 3, 2017 at 11:38 AM, Michael Haggerty <mhagger@xxxxxxxxxxxx> wrote:
> On Tue, Aug 1, 2017 at 7:38 PM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
>> On Tue, Aug 1, 2017 at 6:51 PM, Michael Haggerty <mhagger@xxxxxxxxxxxx> wrote:
>>> On Tue, Aug 1, 2017 at 4:27 PM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
>>>> On Mon, Jul 31, 2017 at 11:41 PM, Michael Haggerty <mhagger@xxxxxxxxxxxx> wrote:
>>>>> [...]
>>>> A couple of other notes about your contrasting design:
>>>> ...
>>> OBJS blocks can also be
>>> unbounded in size if very many references point at the same object,
>>> though that is perhaps only a theoretical problem.
>>
>> Gah, I missed that in reftable. The block id pointer list could cause
>> a single object id to exceed what fits in a block, and that will cause
>> the writer to fail unless its caller sets the block size larger. I
>> basically assumed this overflow condition is very unlikely, as it's not
>> common to have a huge number of refs pointing to the same object.
>
> Given what Peff pointed out, let's just leave this as a varint for OBJS blocks.

We discussed this at $DAY_JOB yesterday. We realized that if an obj
block has that many ref pointers present, it may be more efficient for
a reader to scan all references instead of chasing those pointers
individually. The latest draft of reftable now omits the ref pointer
list in an obj block if it exceeds the obj block size, which only
occurs when a high proportion of the ref blocks contain that SHA-1.
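
To make the rule concrete, here is a rough Python sketch of how a
writer might drop an oversized pointer list and how a reader might
fall back to a full scan. It is only an illustration under stated
assumptions, not the reftable wire format or any existing
implementation: the names (make_obj_record, build_obj_index,
refs_pointing_at), the in-memory block model, and the 7-bits-per-byte
varint size estimate are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumed model: a ref block is a list of (refname, object_id) pairs.
RefBlock = List[Tuple[str, str]]


def varint_len(n: int) -> int:
    """Bytes needed for n at 7 payload bits per byte (assumed encoding)."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length


def make_obj_record(block_ids: List[int], obj_block_size: int,
                    id_len: int = 20) -> Optional[List[int]]:
    """Return the pointer list to store, or None when it must be omitted.

    The record size is estimated as the object id (20 bytes for SHA-1)
    plus one varint per ref block pointer.
    """
    size = id_len + sum(varint_len(b) for b in block_ids)
    return list(block_ids) if size <= obj_block_size else None


def build_obj_index(ref_blocks: List[RefBlock],
                    obj_block_size: int) -> Dict[str, Optional[List[int]]]:
    """Map each object id to its ref block pointers, or None if omitted."""
    positions: Dict[str, List[int]] = {}
    for block_id, block in enumerate(ref_blocks):
        for _name, oid in block:
            ids = positions.setdefault(oid, [])
            if not ids or ids[-1] != block_id:  # record each block once
                ids.append(block_id)
    return {oid: make_obj_record(ids, obj_block_size)
            for oid, ids in positions.items()}


def refs_pointing_at(oid: str, ref_blocks: List[RefBlock],
                     obj_index: Dict[str, Optional[List[int]]]) -> List[str]:
    """Find refs that point at oid, scanning every block if pointers are omitted."""
    if oid not in obj_index:
        return []
    pointers = obj_index[oid]
    # None means the writer dropped the list: scan all ref blocks instead.
    candidates = range(len(ref_blocks)) if pointers is None else pointers
    return [name
            for b in candidates
            for name, target in ref_blocks[b]
            if target == oid]
```

With a deliberately tiny obj block size, an object that appears in
most ref blocks gets no pointer list and is resolved by a full scan,
while a rarely referenced object still gets the direct pointers.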