On Mon, Mar 17, 2025 at 09:38:23PM -0400, Jeff King wrote: > On Fri, Mar 14, 2025 at 04:18:31PM -0400, Taylor Blau wrote: > > > The pack-bitmap machinery uses `bitmap_for_commit()` to locate the > > EWAH-compressed bitmap corresponding to some given commit object. > > > > Teach this function about incremental MIDX bitmaps by teaching it to > > recur on earlier bitmap layers when it fails to find a given commit in > > the current layer. > > > > The changes to do so are as follows: > > > > - Avoid initializing hash_pos at its declaration, since > > bitmap_for_commit() is now a recursive function and may receive a > > NULL bitmap_index pointer as its first argument. > > > > - In cases where we would previously return NULL (to indicate that a > > lookup failed and the given bitmap_index does not contain an entry > > corresponding to the given commit), recursively call the function on > > the previous bitmap layer. > > This makes sense, though it does make me wonder if we could/should store > a (midx/pack,pos) pair. I.e., a master hash table stored once for the > whole midx stack. And then you wouldn't need to recurse; it would just > be a single lookup. > > Or would that work badly with the lazy nature? You'd need to load all of > the layers to fill it (rather than doing each incrementally). OTOH, if > you ask for the bitmap for commit X you're eventually going to have to > figure out what's in all of the layers as soon as you have a miss and > have to check them all. And I think the lookup table extension is what's > supposed to make that cheap-ish. I think that it's a good idea, though TBH I think there is even more room for improvement there, like recording cache misses. I suspect the details are fiddly enough that I'd rather tackle them outside of this already-fiddly series, though ;-). Thanks, Taylor