On Wed, May 29, 2024 at 09:48:20AM +0200, Patrick Steinhardt wrote: > On Thu, May 23, 2024 at 12:38:19PM -0400, Taylor Blau wrote: > > Avoid unconditionally copying all packs from an existing MIDX into a new > > MIDX by checking that packs added via `fill_packs_from_midx()` don't > > appear in the `to_include` set, if one was provided. > > > > Do so by calling `should_include_pack()` from both `add_pack_to_midx()` > > and `fill_packs_from_midx()`. > > This is missing an explanation why exactly we want that. Is the current > behaviour a bug? Is it a preparation for a future change? Is this change > expected to modify any existing behaviour? > > Reading through the patch we now unconditionally load the existing MIDX > when writing a new one, but I'm not exactly sure what the effect of that > is going to be. Very fair. The short answer is that this is a prerequisite for the incremental MIDX series that I'm working on. The longer answer is that an incremental MIDX-aware writer needs to be able to consult with the existing MIDX (if one exists) and exclude any objects which already appear in an earlier layer of the MIDX. This is done because we cannot have the same object appear in multiple layers of the MIDX, for reasons that are probably not interesting to this series. I put a more concise version of the explanation above into this patch which I'll send another round of in v2 of this series shortly. > [snip] > > diff --git a/midx-write.c b/midx-write.c > > index 9712ac044f..36ac4ab65b 100644 > > --- a/midx-write.c > > +++ b/midx-write.c > > @@ -101,27 +101,13 @@ struct write_midx_context { > > }; > > > > static int should_include_pack(const struct write_midx_context *ctx, > > - const char *file_name) > > + const char *file_name, > > + int exclude_from_midx) > > { > > - /* > > - * Note that at most one of ctx->m and ctx->to_include are set, > > - * so we are testing midx_contains_pack() and > > - * string_list_has_string() independently (guarded by the > > - * appropriate NULL checks). > > - * > > - * We could support passing to_include while reusing an existing > > - * MIDX, but don't currently since the reuse process drags > > - * forward all packs from an existing MIDX (without checking > > - * whether or not they appear in the to_include list). > > - * > > - * If we added support for that, these next two conditional > > - * should be performed independently (likely checking > > - * to_include before the existing MIDX). > > - */ > > - if (ctx->m && midx_contains_pack(ctx->m, file_name)) > > + if (exclude_from_midx && ctx->m && midx_contains_pack(ctx->m, file_name)) > > return 0; > > - else if (ctx->to_include && > > - !string_list_has_string(ctx->to_include, file_name)) > > + if (ctx->to_include && !string_list_has_string(ctx->to_include, > > + file_name)) > > The second branch is a no-op change, right? The only change was that you > converted from `else if` to `if`. I'd propose to either keep this as-is, > or to do this change in the preceding patch already that introduces this > function. It is a no-op, but I would rather keep these separate to keep the previous step a pure code movement rather than introducing any textual changes. Thanks, Taylor