Re: [PATCH 6/8] midx-write.c: support reading an existing MIDX with `packs_to_include`

On Wed, May 29, 2024 at 09:48:20AM +0200, Patrick Steinhardt wrote:
> On Thu, May 23, 2024 at 12:38:19PM -0400, Taylor Blau wrote:
> > Avoid unconditionally copying all packs from an existing MIDX into a new
> > MIDX by checking that packs added via `fill_packs_from_midx()` don't
> > appear in the `to_include` set, if one was provided.
> >
> > Do so by calling `should_include_pack()` from both `add_pack_to_midx()`
> > and `fill_packs_from_midx()`.
>
> This is missing an explanation why exactly we want that. Is the current
> behaviour a bug? Is it a preparation for a future change? Is this change
> expected to modify any existing behaviour?
>
> Reading through the patch we now unconditionally load the existing MIDX
> when writing a new one, but I'm not exactly sure what the effect of that
> is going to be.

Very fair. The short answer is that this is a prerequisite for the
incremental MIDX series that I'm working on. The longer answer is that
an incremental MIDX-aware writer needs to be able to consult the
existing MIDX (if one exists) and exclude any objects that already
appear in an earlier layer of the MIDX. This is necessary because the
same object cannot appear in multiple layers of the MIDX, for reasons
that are probably not interesting to this series.

I put a more concise version of the explanation above into this patch,
which I'll resend as part of v2 of this series shortly.

> [snip]
> > diff --git a/midx-write.c b/midx-write.c
> > index 9712ac044f..36ac4ab65b 100644
> > --- a/midx-write.c
> > +++ b/midx-write.c
> > @@ -101,27 +101,13 @@ struct write_midx_context {
> >  };
> >
> >  static int should_include_pack(const struct write_midx_context *ctx,
> > -			       const char *file_name)
> > +			       const char *file_name,
> > +			       int exclude_from_midx)
> >  {
> > -	/*
> > -	 * Note that at most one of ctx->m and ctx->to_include are set,
> > -	 * so we are testing midx_contains_pack() and
> > -	 * string_list_has_string() independently (guarded by the
> > -	 * appropriate NULL checks).
> > -	 *
> > -	 * We could support passing to_include while reusing an existing
> > -	 * MIDX, but don't currently since the reuse process drags
> > -	 * forward all packs from an existing MIDX (without checking
> > -	 * whether or not they appear in the to_include list).
> > -	 *
> > -	 * If we added support for that, these next two conditional
> > -	 * should be performed independently (likely checking
> > -	 * to_include before the existing MIDX).
> > -	 */
> > -	if (ctx->m && midx_contains_pack(ctx->m, file_name))
> > +	if (exclude_from_midx && ctx->m && midx_contains_pack(ctx->m, file_name))
> >  		return 0;
> > -	else if (ctx->to_include &&
> > -		 !string_list_has_string(ctx->to_include, file_name))
> > +	if (ctx->to_include && !string_list_has_string(ctx->to_include,
> > +						       file_name))
>
> The second branch is a no-op change, right? The only change was that you
> converted from `else if` to `if`. I'd propose to either keep this as-is,
> or to do this change in the preceding patch already that introduces this
> function.

It is a no-op, but I would rather keep these separate so that the
previous step remains a pure code movement without any textual
changes.
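
For anyone following along without the tree handy, here is a rough,
self-contained sketch of the check as it reads after this hunk. It is
not the code in midx-write.c: `name_set`, `sketch_ctx`,
`name_set_has()` and `should_include_pack_sketch()` are hypothetical
stand-ins for the MIDX, the `to_include` string list, and their lookup
helpers. It also assumes that `add_pack_to_midx()` passes a non-zero
`exclude_from_midx` and `fill_packs_from_midx()` passes zero, which the
quoted hunk does not show. Since the first branch returns, `if` and
`else if` behave identically for the second one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for both the existing MIDX and to_include. */
struct name_set {
	const char **names;
	size_t nr;
};

/* Hypothetical, trimmed-down stand-in for struct write_midx_context. */
struct sketch_ctx {
	const struct name_set *m;          /* packs in the existing MIDX, or NULL */
	const struct name_set *to_include; /* packs_to_include, or NULL */
};

/* Linear lookup; the real helpers are not reproduced here. */
static int name_set_has(const struct name_set *set, const char *name)
{
	size_t i;

	for (i = 0; i < set->nr; i++)
		if (!strcmp(set->names[i], name))
			return 1;
	return 0;
}

static int should_include_pack_sketch(const struct sketch_ctx *ctx,
				      const char *file_name,
				      int exclude_from_midx)
{
	/* Skip packs the existing MIDX already covers, if asked to. */
	if (exclude_from_midx && ctx->m && name_set_has(ctx->m, file_name))
		return 0;
	/* Checked independently: the branch above has already returned. */
	if (ctx->to_include && !name_set_has(ctx->to_include, file_name))
		return 0;
	return 1;
}
```

With `exclude_from_midx` set to zero, a pack carried over from the
existing MIDX is kept only when it also appears in `to_include` (or
when no `to_include` was given at all).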

Thanks,
Taylor



