Re: [PATCH v2 0/5] repack: introduce '--combine-cruft-below-size'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 19, 2025 at 3:52 PM Taylor Blau <me@xxxxxxxxxxxx> wrote:
>
> Here is a small reroll of my series to introduce a new 'repack' flag
> called '--combine-cruft-below-size'. There aren't any functional changes
> in this round, and the changes that do exist are limited to
> documentation and commit message tweaks in the final patch.
>
> A range-diff is included below, as well as the original cover letter:
>
> (This is based on tb/multi-cruft-pack-refresh-fix from Junio's tree,
> which is at 08f612ba70 (builtin/pack-objects.c: freshen objects from
> existing cruft packs, 2025-03-13) at the time of writing).
>
> This series replaces something close to the existing behavior of
> repack's '--max-cruft-size' flag with '--combine-cruft-below-size'.
>
> The new flag is much clearer in its intent and function, and avoids the
> lack of clarity between the two that was discussed in
>
>   <cover.1741648467.git.me@xxxxxxxxxxxx>
>
> The new behavior is as follows:
>
>   - '--max-cruft-size' is a cruft pack-specific override for repack's
>     '--max-pack-size' command-line flag.
>
>   - '--combine-cruft-below-size' instructs repack to only combine cruft
>     packs which are smaller than the given threshold. This will likely
>     result in packs which are larger than the threshold. But that is OK:
>     the point is to limit the size of the individual packs on input, not
>     the size of the outgoing pack.
>
> This series does break the existing behavior of '--max-cruft-size'. But
> I think breaking backwards compatibility here is OK, since the existing
> behavior was so broken to begin with. I'm happy to consider other
> alternatives if others feel that this isn't OK.
>
> The series has an interesting structure that I feel may be worth calling
> out. The first three patches are trivial test cleanups. The fourth patch
> modifies the existing behavior of '--max-cruft-size', but does so while
> keeping some of the old infrastructure around.
>
> That may seem like an unnecessarily complicated approach, but it greatly
> simplifies introducing the new behavior in the following (and final)
> commit. I think that this results in a series that is a little easier to
> review (since we don't see a ton of code being removed in one commit and
> then re-added in another immediately following it). But if others feel
> like this was a mistake, please let me know ;-).
>
> Thanks in advance for your review!
>
> Taylor Blau (5):
>   t/t5329-pack-objects-cruft.sh: evict 'repack'-related tests
>   t/t7704-repack-cruft.sh: clarify wording in --max-cruft-size tests
>   t/t7704-repack-cruft.sh: consolidate `write_blob()`
>   repack: avoid combining cruft packs with `--max-cruft-size`
>   repack: begin combining cruft packs with `--combine-cruft-below-size`
>
>  Documentation/git-repack.adoc |  21 ++-
>  builtin/repack.c              |  62 +++----
>  t/t5329-pack-objects-cruft.sh | 302 ++++++----------------------------
>  t/t7704-repack-cruft.sh       | 293 ++++++++++++++++++++++++++++++---
>  4 files changed, 355 insertions(+), 323 deletions(-)
>
> Range-diff against v1:
> 1:  0aa8aa65c1 = 1:  0aa8aa65c1 t/t5329-pack-objects-cruft.sh: evict 'repack'-related tests
> 2:  5e8bd3e66e = 2:  5e8bd3e66e t/t7704-repack-cruft.sh: clarify wording in --max-cruft-size tests
> 3:  b075ad8601 = 3:  b075ad8601 t/t7704-repack-cruft.sh: consolidate `write_blob()`
> 4:  7941997e33 = 4:  7941997e33 repack: avoid combining cruft packs with `--max-cruft-size`
> 5:  7f120c35e9 ! 5:  dee780a2aa repack: begin combining cruft packs with `--combine-cruft-below-size`
>     @@ Commit message
>          Introduce a new flag, '--combine-cruft-below-size' which is a
>          replacement for the old behavior of '--max-cruft-size'. This new flag
>          does explicitly what it says: it combines together cruft packs which are
>     -    smaller than a given threshold, and prohibits repacking ones which are
>     +    smaller than a given threshold, and leaves alone ones which are
>          larger.
>
>          This accomplishes the original intent of '--max-cruft-size', which was
>     @@ Commit message
>          But that's OK: the point isn't to restrict the size of the cruft packs
>          we generate, it's to avoid working with ones that have already grown too
>          large. If repositories still want to limit the size of the generated
>     -    cruft pack(s), they may use '--max-cruft-size' instead.
>     +    cruft pack(s), they may use '--max-cruft-size'.
>
>          There's some minor test fallout as a result of the slight differences in
>          behavior between the old meaning of '--max-cruft-size' and the behavior
>     @@ Documentation/git-repack.adoc: to the new separate pack will be written.
>
>      +--combine-cruft-below-size=<n>::
>      +  When generating cruft packs without pruning, only repack
>     -+  existing cruft packs whose size is strictly less than `<n>`.
>     -+  Cruft packs whose size is greater than or equal to `<n>` are
>     -+  left as-is and not repacked. Useful when you want to avoid
>     -+  repacking large cruft pack(s) in repositories that have many
>     -+  and/or large unreachable objects.
>     ++  existing cruft packs whose size is strictly less than `<n>`,
>     ++  where `<n>` represents a number of bytes, which can optionally
>     ++  be suffixed with "k", "m", or "g". Cruft packs whose size is
>     ++  greater than or equal to `<n>` are left as-is and not repacked.
>     ++  Useful when you want to avoid repacking large cruft pack(s) in
>     ++  repositories that have many and/or large unreachable objects.
>      +
>       --expire-to=<dir>::
>         Write a cruft pack containing pruned objects (if any) to the
>
> base-commit: 08f612ba7000bf181ef6d8baed9ece322e567efd
> --
> 2.49.0.4.ge59cf92f8d

v2 + your comments address all my feedback from v1, so this round
looks good to me.





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux