On Fri, Feb 28, 2025 at 11:01:12AM +0100, Patrick Steinhardt wrote: > On Tue, Nov 19, 2024 at 05:07:22PM -0500, Taylor Blau wrote: > > diff --git a/pack-bitmap.c b/pack-bitmap.c > > index 4fa9dfc771a..bba9c6a905a 100644 > > --- a/pack-bitmap.c > > +++ b/pack-bitmap.c > > @@ -170,6 +170,15 @@ static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) > > return read_bitmap(index->map, index->map_size, &index->map_pos); > > } > > > > +static uint32_t bitmap_non_extended_bits(struct bitmap_index *index) > > +{ > > + if (index->midx) { > > + struct multi_pack_index *m = index->midx; > > + return m->num_objects + m->num_objects_in_base; > > + } > > + return index->pack->num_objects; > > +} > > > > static uint32_t bitmap_num_objects(struct bitmap_index *index) > > { > > if (index->midx) > > Okay, despite counting our own objects, we also need to account for any > objects that the MIDX layer that we depend on may refer to. I assume > that this is basically recursive, and that the base itself would also > account for its next layer, if any. > > What is interesting to see after this commit is what callsites remain > for `bitmap_num_objects()`. Most of them are converted, but some still > exist: > > - `load_bitmap_header()`, where we use it to determine the size of the > hash cache. Makes sense. > > - `pseudo_merge_bitmap_for_commit()`, where we use it to compute the > merged bitmap of a specific commit. This one feels weird to me, I > would have expected to use `bitmap_non_extended_bits()` here. Great question. The reason is that this function determines a bitmap which identifies the parents of a given commit, and that bitmap is compared against the set of pseudo-merge commits we know about in a given layer to determine whether or not we have a matching pseudo-merge. I left a comment to that effect nearby since this is far from obvious (including to me -- I had to take a few minutes to remember how all of this works!). > - `filter_bitmap_blob_limit()`, where we seem to filter through the > bitmap of the current layer. I _think_ it makes sense to retain. > > - `create_bitmap_mapping()`, which feels like it should be converted? I *think* that this is OK because we are only remapping one layer at a time, but I'll have to double check. I'd do so now, but I'm trying to respond as much as I can before my week is over ;-). > It would be nice to document in the commit message why those functions > don't need to be converted to help guide the reader a bit. The conversions here are case-by-case, so I lean towards documenting any non-obvious ones inline with a brief comment (like the adjustment I made above for pseudo_merge_bitmap_for_commit()). > > @@ -491,7 +499,8 @@ static int midx_pack_order_cmp(const void *va, const void *vb) > > const struct midx_pack_key *key = va; > > struct multi_pack_index *midx = key->midx; > > > > - uint32_t versus = pack_pos_to_midx(midx, (uint32_t*)vb - (const uint32_t *)midx->revindex_data); > > + size_t pos = (uint32_t*)vb - (const uint32_t *)midx->revindex_data; > > Micronit: missing space between `uint32_t` and `*`. Great catch, thanks! > > + uint32_t versus = pack_pos_to_midx(midx, pos + midx->num_objects_in_base); > > uint32_t versus_pack = nth_midxed_pack_int_id(midx, versus); > > off_t versus_offset; > > > > Okay, the calculation to calculate the position is basically the same, > but we now also offset the position by the number of objects in > preceding layers. Makes sense. Exactly. Thanks, Taylor