On Thu, Aug 01, 2024 at 05:21:42AM -0400, Jeff King wrote: > On Wed, Jul 17, 2024 at 05:12:01PM -0400, Taylor Blau wrote: > > > The incremental MIDX chain feature is designed around the idea of > > indexing into a concatenated lexicographic ordering of object IDs > > present in the MIDX. > > > > When given an object position, the MIDX machinery needs to be able to > > locate both (a) which MIDX layer contains the given object, and (b) at > > what position *within that MIDX layer* that object appears. > > > > To do this, three new fields are added to the `struct multi_pack_index`: > > > > - struct multi_pack_index *base_midx; > > - uint32_t num_objects_in_base; > > - uint32_t num_packs_in_base; > > > > These three fields store the pieces of information suggested by their > > respective field names. In turn, the `num_objects_in_base` and > > `num_packs_in_base` fields are used to crawl backwards along the > > `base_midx` pointer to locate the appropriate position for a given > > object within the MIDX that contains it. > > OK, so base_midx is a back-pointer. I think in theory you could compute > num_objects_in_base on the fly by doing that crawl yourself, but we'd > want to be able to do it in constant time, rather than O(# of midx)? > > Makes sense. Yep. As you have seen in the later patches in this series, we end up needing to read num_objects_in_base and num_packs_in_base quite frequently, so it's nice to have them precomputed. We could compute them lazily, but they're easy to build up in midx.c::add_midx_to_chain(), where we're already validating that, for e.g.: if (unsigned_add_overflows(midx_chain->num_packs, midx_chain->num_packs_in_base)) { /* ... */ } midx->num_packs_in_base = midx_chain->num_packs + midx_chain->num_packs_in_base; > -Peff Thanks, Taylor