On Wed, Jul 17, 2024 at 05:11:54PM -0400, Taylor Blau wrote: > This series implements incremental MIDXs, which allow for storing > a MIDX across multiple layers, each with their own distinct set of > packs. > > This round is mostly unchanged from the previous since there has not yet > been substantial review. But it does rebase to current 'master' (which > is 04f5a52757 (Post 2.46-rc0 batch #2, 2024-07-16), at the time of > writing). > > Importantly, this rebase moves this topic to be based on an ancestor of > 0c5a62f14b (midx-write.c: do not read existing MIDX with > `packs_to_include`, 2024-06-11), which resulted in a non-trivial > conflict prior to this rebase. > > The rest of the topic is unchanged. I don't expect that we'll see much > review here for the next couple of weeks while we are in the -rc phase, > but I figured it would be useful to have it on the list for folks that > are interested in taking a look. > > Thanks in advance for any review! :-) I gave it a pretty thorough look. Everything looks good for the most part. I left a few comments, but mostly just thinking my way through things. The trickiest parts were: - the confusion between when we want local per-layer positions versus global positions within the whole chainfile, or whether functions are operating on a single layer versus the whole chain. I mused a bit on how we could do it differently, but ultimately I'm not sure there any good solutions. - the changes you did make look good, but it's hard to know if there's code lurking that still needs to be adjusted for chained midx's. For that I think I'd turn more towards testing than code review. I'm not sure how much interesting coverage we're getting from the GIT_TEST variable, just because the repositories made in most of the tests are so trivial. I'd love to see the results on a real workload (both a big repo, but also how things behave over days or weeks of repository maintenance done with incremental midxs). I know that can be hard to do, though. -Peff