On Mon, Apr 29, 2024 at 04:43:08PM -0400, Taylor Blau wrote: > The pack-bitmap-writer machinery uses a oidmap (backed by khash.h) to > map from commits selected for bitmaps (by OID) to a bitmapped_commit > structure (containing the bitmap itself, among other things like its XOR > offset, etc.) > > This map was initialized at the end of `bitmap_writer_build()`. New > entries are added in `pack-bitmap-write.c::store_selected()`, which is > called by the bitmap_builder machinery (which is responsible for > traversing history and generating the actual bitmaps). > > Reorganize when this field is initialized and when entries are added to > it so that we can quickly determine whether a commit is a candidate for > pseudo-merge selection, or not (since it was already selected to receive > a bitmap, and thus is ineligible for pseudo-merge inclusion). I feel like this last sentence here could use some more explanation as the restriction has never been explained before. Is this a strict requirement, or is this rather "It would be wasted anyway"? > The changes are as follows: > > - Introduce a new `bitmap_writer_init()` function which initializes > the `writer.bitmaps` field (instead of waiting until the end of > `bitmap_writer_build()`). > > - Add map entries in `push_bitmapped_commit()` (which is called via > `bitmap_writer_select_commits()`) with OID keys and NULL values to > track whether or not we *expect* to write a bitmap for some given > commit. > > - Validate that a NULL entry is found matching the given key when we > store a selected bitmap. It would be great if this refactoring went way further. Right now it's quite hard to verify whether the writer has really been initialized in all the right places because it is a global variable. Ideally, the whole interface should be refactored to take the writer as input instead, where `bitmap_writer_init()` would then initialize the local variables. That'd of course be a bigger refactoring and may or may not be a good fit for this patch series. But I'd very much love to see such a refactor as a follow-up series. [snip] > diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c > index c35bc81d00f..9bc41a9e145 100644 > --- a/pack-bitmap-write.c > +++ b/pack-bitmap-write.c > @@ -46,6 +46,11 @@ struct bitmap_writer { > > static struct bitmap_writer writer; > > +void bitmap_writer_init(struct repository *r) > +{ > + writer.bitmaps = kh_init_oid_map(); > +} Given the other safety belts, do we also want to BUG here in case the bitmap has already been initialized? Patrick
Attachment:
signature.asc
Description: PGP signature