Re: [PATCH 5/5] repack: begin combining cruft packs with `--combine-cruft-below-size`

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 19, 2025 at 07:21:01AM -0700, Elijah Newren wrote:
> On Mon, Mar 17, 2025 at 4:00 PM Taylor Blau <me@xxxxxxxxxxxx> wrote:
> >
> > The previous commit changed the behavior of repack's '--max-cruft-size'
> > to specify a cruft pack-specific override for '--max-pack-size'.
> >
> > Introduce a new flag, '--combine-cruft-below-size' which is a
> > replacement for the old behavior of '--max-cruft-size'. This new flag
> > does explicitly what it says: it combines together cruft packs which are
> > smaller than a given threshold, and prohibits repacking ones which are
> > larger.
>
> To me "prohibits" suggests some kind of stronger action that
> potentially persists beyond the end of this operation. Perhaps this
> could be reworded to something like
> s/prohibits repacking ones/leaves alone packs/ ?

Fair enough, I went with "leaves alone ones which are larger".

> > This accomplishes the original intent of '--max-cruft-size', which was
> > to avoid repacking cruft packs larger than the given threshold.
> >
> > The new behavior is slightly different. Instead of building up small
> > packs together until the threshold is met, '--combine-cruft-below-size'
> > packs up *all* cruft packs smaller than the threshold. This means that
> > we may make a pack much larger than the given threshold (e.g., if you
> > aggregate 5 packs which are each 99 MiB in size with a threshold of 100
> > MiB).
> >
> > But that's OK: the point isn't to restrict the size of the cruft packs
> > we generate, it's to avoid working with ones that have already grown too
> > large. If repositories still want to limit the size of the generated
> > cruft pack(s), they may use '--max-cruft-size' instead.
>
> ...but then they wouldn't get any cruft packs being combined.  Did you
> mean s/instead/together with --combine-cruft-below-size/ ?

Oops, yeah; I meant to imply that '--max-cruft-size' was/is still an
option, not that one excludes the other. I went for s/instead//.

> > There's some minor test fallout as a result of the slight differences in
> > behavior between the old meaning of '--max-cruft-size' and the behavior
> > of '--combine-cruft-below-size'. In the test which is now called
> > "--combine-cruft-below-size combines packs", we need to use the new flag
> > over the old one to exercise that test's intended behavior. The
> > remainder of the changes there are to improve the clarity of the
> > comments.
> >
> > Suggested-by: Elijah Newren <newren@xxxxxxxxx>
> > Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>
> > ---
> >  Documentation/git-repack.adoc |  8 ++++++++
> >  builtin/repack.c              | 38 +++++++++++++++++++++++------------
> >  t/t7704-repack-cruft.sh       | 22 +++++++++++---------
> >  3 files changed, 46 insertions(+), 22 deletions(-)
> >
> > diff --git a/Documentation/git-repack.adoc b/Documentation/git-repack.adoc
> > index 11db43b1c5..8e6d61aa2f 100644
> > --- a/Documentation/git-repack.adoc
> > +++ b/Documentation/git-repack.adoc
> > @@ -81,6 +81,14 @@ to the new separate pack will be written.
> >         `--max-pack-size` (if any) by default. See the documentation for
> >         `--max-pack-size` for more details.
> >
> > +--combine-cruft-below-size=<n>::
> > +       When generating cruft packs without pruning, only repack
> > +       existing cruft packs whose size is strictly less than `<n>`.
> > +       Cruft packs whose size is greater than or equal to `<n>` are
> > +       left as-is and not repacked. Useful when you want to avoid
> > +       repacking large cruft pack(s) in repositories that have many
> > +       and/or large unreachable objects.
> > +
>
> Does it make sense to modify the documentation for either the
> --max-cruft-szie or --combine-cruft-below-size options to suggest that
> if both are used, it is recommended to make --max-cruft-size twice (or
> more) the value of --combine-cruft-below-size ?

Ehhh... I kind of want to avoid mentioning it TBH. Part of the reason
there is that I think the explanation of "why" is a little too detailed
to spell out meaningfully in the documentation while still keeping it
concise. But more importantly I want to avoid encouraging people to use
the '--max-{pack,cruft}-size' flags to begin with, since there really is
no reason to use them outside of filesystem constraints.

Thanks,
Taylor




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux